Intel387 SX Math Coprocessor

- New automatic power management
  - Low power consumption
  - Typically 100 mA in dynamic mode, and 4 mA in idle mode
- Socket compatible with the Intel387 family of math coprocessors
  - Hardware and software compatible
  - Supported by over 2100 commercial software packages
  - 10% to 15% performance increase on Whetstone and Livermore benchmarks
- Compatible with the Intel386 SX microprocessor
  - Extends the CPU instruction set to include trigonometric, logarithmic, and exponential functions
- High performance 80-bit internal architecture
- Implements ANSI/IEEE Standard 754-1985 for binary floating-point arithmetic
- Available in a 68-pin PLCC package (see Intel Packaging Specification, Order #231369)

The Intel387 SX math coprocessor is an extension to the Intel386 SX microprocessor architecture. The combination of the Intel387 SX with the Intel386 SX microprocessor dramatically increases the processing speed of computer application software that uses high performance floating-point operations. An internal power management unit enables the Intel387 SX to perform these floating-point operations while maintaining very low power consumption for portable and desktop applications. The internal power management unit reduces power consumption by 95% when the device is idle.

The Intel387 SX math coprocessor is available in a 68-pin PLCC package, and is manufactured on Intel's advanced 1.0 micron CHMOS IV technology.

Intel386 and Intel387 are trademarks of Intel Corporation.

* Other brands and names are the property of their respective owners. Information in this document is provided in connection with Intel products. Intel assumes no liability whatsoever, including infringement of any patent or copyright, for sale and use of Intel products except as provided in Intel's Terms and Conditions of Sale for such products. Intel retains the right to make changes to these specifications at any time, without notice. Microcomputer products may have minor variations to this specification known as errata.

January 1994
Copyright © Intel Corporation, 1995
Order Number: 240225-009
CONTENTS

1.0 Pin Assignment
  1.1 Pin Description Table
2.0 Functional Description
  2.1 Feature List
  2.2 Math Coprocessor Architecture
  2.3 Power Management
    2.3.1 Dynamic Mode
    2.3.2 Idle Mode
  2.4 Compatibility
  2.5 Performance
3.0 Programming Interface
  3.1 Instruction Set
    3.1.1 Data Transfer Instructions
    3.1.2 Arithmetic Instructions
    3.1.3 Comparison Instructions
    3.1.4 Transcendental Instructions
    3.1.5 Load Constant Instructions
    3.1.6 Processor Instructions
  3.2 Register Set
    3.2.1 Status Word (SW) Register
    3.2.2 Control Word (CW) Register
    3.2.3 Data Register
    3.2.4 Tag Word (TW) Register
    3.2.5 Instruction and Data Pointers
  3.3 Data Types
  3.4 Interrupt Description
  3.5 Exception Handling
  3.6 Initialization
  3.7 Processing Modes
  3.8 Programming Support
4.0 Hardware System Interface
  4.1 Signal Description
    4.1.1 Intel386 CPU Clock 2 (CPUCLK2)
    4.1.2 Intel387 Math Coprocessor Clock 2 (NUMCLK2)
    4.1.3 Clocking Mode (CKM)
    4.1.4 System Reset (RESETIN)
    4.1.5 Processor Request (PEREQ)
    4.1.6 Busy Status (BUSY#)
    4.1.7 Error Status (ERROR#)
    4.1.8 Data Pins (D15-D0)
    4.1.9 Write/Read Bus Cycle (W/R#)
    4.1.10 Address Strobe (ADS#)
    4.1.11 Bus Ready Input (READY#)
    4.1.12 Ready Output (READYO#)
    4.1.13 Status Enable (STEN)
    4.1.14 Math Coprocessor Select 1 (NPS1#)
    4.1.15 Math Coprocessor Select 2 (NPS2)
    4.1.16 Command (CMD0#)
    4.1.17 System Power (VCC)
    4.1.18 System Ground (VSS)
  4.2 System Configuration
  4.3 Math Coprocessor Architecture
    4.3.1 Bus Control Logic
    4.3.2 Data Interface and Control Unit
    4.3.3 Floating Point Unit
    4.3.4 Power Management Unit
  4.4 Bus Cycles
    4.4.1 Intel387 SX Math Coprocessor Addressing
    4.4.2 CPU/Math Coprocessor Synchronization
    4.4.3 Synchronous/Asynchronous Modes
    4.4.4 Automatic Bus Cycle Termination
5.0 Bus Operation
  5.1 Non-Pipelined Bus Cycles
    5.1.1 Write Cycle
    5.1.2 Read Cycle
  5.2 Pipelined Bus Cycles
  5.3 Mixed Bus Cycles
  5.4 BUSY# and PEREQ Timing Relationship
6.0 Package Specifications
  6.1 Mechanical Specifications
  6.2 Thermal Specifications
7.0 Electrical Characteristics
  7.1 Absolute Maximum Ratings
  7.2 D.C. Characteristics
  7.3 A.C. Characteristics
8.0 Intel387 SX Math Coprocessor Instruction Set
Appendix A - Intel387 SX Math Coprocessor Compatibility
  A.1 8087/80287 Compatibility
    A.1.1 General Differences
    A.1.2 Exceptions
Appendix B - Compatibility Between the 80287 and 8087 Math Coprocessor

FIGURES

Figure 1-1 Intel387 SX Math Coprocessor Pinout
Figure 2-1 Intel387 SX Math Coprocessor Block Diagram
Figure 3-1 Intel386 SX CPU and Intel387 Math Coprocessor Register Set
Figure 3-2 Status Word
Figure 3-3 Control Word
Figure 3-4 Tag Word Register
Figure 3-5 Instruction and Data Pointer Image in Memory, 32-Bit Protected Mode Format
Figure 3-6 Instruction and Data Pointer Image in Memory, 16-Bit Protected Mode Format
Figure 3-7 Instruction and Data Pointer Image in Memory, 32-Bit Real Mode Format
Figure 3-8 Instruction and Data Pointer Image in Memory, 16-Bit Real Mode Format
Figure 4-1 Intel386 SX CPU and Intel387 SX Math Coprocessor System Configuration
Figure 5-1 Bus State Diagram
Figure 5-2 Non-Pipelined Read and Write Cycles
Figure 5-3 Fastest Transition to and from Pipelined Cycles
Figure 5-4 Pipelined Cycles with Wait States
Figure 5-5 BUSY# and PEREQ Timing Relationship
Figure 7-1a Typical Output Valid Delay vs Load Capacitance at Max Operating Temperature
Figure 7-1b Typical Output Slew Time vs Load Capacitance at Max Operating Temperature
Figure 7-1c Maximum ICC vs Frequency
Figure 7-2 CPUCLK2/NUMCLK2 Waveform and Measurement Points for Input/Output
Figure 7-3 Output Signals
Figure 7-4 Input and I/O Signals
Figure 7-5 Reset Signal
Figure 7-6 Float from STEN
Figure 7-7 Other Parameters

TABLES

Table 1-1 Pin Cross Reference - Functional Grouping
Table 3-1 Condition Code Interpretation
Table 3-2 Condition Code Interpretation after FPREM and FPREM1 Instructions
Table 3-3 Condition Code Resulting from Comparison
Table 3-4 Condition Code Defining Operand Class
Table 3-5 Mapping Condition Codes to Intel386 CPU Flag Bits
Table 3-6 Intel387 SX Math Coprocessor Data Type Representation in Memory
Table 3-7 CPU Interrupt Vectors Reserved for Math Coprocessor
Table 3-8 Intel387 SX Math Coprocessor Exceptions
Table 4-1 Pin Summary
Table 4-2 Output Pin Status during Reset
Table 4-3 Bus Cycle Definition
Table 6-1 Thermal Resistances (°C/Watt) θjc and θja
Table 6-2 Maximum TA at Various Airflows
Table 7-1 D.C. Specifications
Table 7-2a Timing Requirements of the Bus Interface Unit
Table 7-2b Timing Requirements of the Execution Unit
Table 7-2c Other AC Parameters
Table 8-1 Instruction Formats
1.0 PIN ASSIGNMENT

The Intel387 SX math coprocessor pinout as viewed from the top side of the component is shown in Figure 1-1. VCC and VSS (GND) connections must be made to multiple pins. The circuit board should include VCC and VSS planes for power distribution, and all VCC and VSS pins must be connected to the appropriate plane.

NOTE: Pins identified as N.C. should remain completely unconnected.

Figure 1-1. Intel387 SX Math Coprocessor Pinout

Table 1-1. Pin Cross Reference - Functional Grouping

  Control and status signals:
    BUSY#    36      PEREQ    56      ERROR#   35
    ADS#     47      CMD0#    48      NPS1#    44
    NPS2     45      STEN     40      W/R#     41
    READY#   49      READYO#  57

  Clock and reset signals:
    CKM      59      CPUCLK2  54      NUMCLK2  53
    RESETIN  51

  Data bus:
    D00 19   D01 20   D02 23   D03  8
    D04  7   D05  6   D06  3   D07  2
    D08 24   D09 28   D10 29   D11 30
    D12 16   D13 15   D14 12   D15 11

  VCC:  4, 9, 13, 22, 26, 31, 33, 37, 39, 43, 46, 50, 58, 62, 64
  VSS:  5, 14, 21, 25, 27, 32, 34, 38, 42, 55, 60, 61, 63, 66
  N.C.: 1, 10, 17, 18, 52, 65, 67, 68
1.1 Pin Description Table

The following table gives a brief description of each pin on the Intel387 SX math coprocessor. For a more complete description refer to Section 4.1, Signal Description. The following definitions are used in these descriptions:

  #    The signal is active low.
  I    Input signal
  O    Output signal
  I/O  Input and output signal

  Symbol    Type  Name and Function
  ADS#      I     ADDRESS STROBE indicates that the address and bus cycle definition is valid.
  BUSY#     O     BUSY indicates that the math coprocessor is currently executing an instruction.
  CKM       I     CLOCKING MODE is used to select synchronous or asynchronous clock modes.
  CMD0#     I     COMMAND determines whether an opcode or an operand is being sent to the math coprocessor. During a read cycle it indicates which register group is being read.
  CPUCLK2   I     CPU CLOCK INPUT provides the timing for the bus interface unit and the execution unit in synchronous mode.
  D15-D0    I/O   DATA BUS is used to transfer instructions and data between the math coprocessor and the CPU.
  ERROR#    O     ERROR signals that an unmasked exception has occurred.
  NC        -     NO CONNECT should always remain unconnected. Connection of an N.C. pin may cause the math coprocessor to malfunction or be incompatible with future steppings.
  NPS1#     I     NPX SELECT 1 is used to select the math coprocessor.
  NPS2      I     NPX SELECT 2 is used to select the math coprocessor.
  NUMCLK2   I     NUMERICS CLOCK is used in asynchronous mode to drive the floating point execution unit.
  PEREQ     O     PROCESSOR EXTENSION REQUEST signals the CPU that the math coprocessor is ready for data transfer to/from its FIFO.
  READY#    I     READY indicates that the bus cycle is being terminated.
  READYO#   O     READY OUT signals the CPU that the math coprocessor is terminating the bus cycle.
  RESETIN   I     SYSTEM RESET terminates any operation in progress and forces the math coprocessor to enter a dormant state.
  STEN      I     STATUS ENABLE serves as a master chip select for the math coprocessor. When inactive, this pin forces all outputs and bi-directional pins into a floating state.
  W/R#      I     WRITE/READ indicates whether the CPU bus cycle in progress is a read or a write cycle.
  VCC       I     SYSTEM POWER provides the +5V nominal D.C. supply input.
  VSS       I     SYSTEM GROUND provides the 0V connection from which all inputs and outputs are measured.
2.0 FUNCTIONAL DESCRIPTION

The Intel387 SX math coprocessor is designed to support the Intel386 SX microprocessor and effectively extend the CPU architecture by providing fast execution of arithmetic instructions and transcendental functions. This component contains internal power management circuitry for reduced active power dissipation and an automatic idle mode.

2.1 Feature List

- New power saving design provides low power dissipation in active and idle modes.
- Higher performance: 10% to 25% higher benchmark performance than the original Intel387 SX math coprocessor.
- High performance 84-bit internal architecture.
- Eight 80-bit numeric registers, usable as individually addressable general registers or as a register stack.
- Full-range transcendental operations for sine, cosine, tangent, arctangent, and logarithm.
- Programmable rounding modes and notification of rounding effects.
- Exception reporting either by software polling or hardware interrupts.
- Fully compatible with the Intel386 SX microprocessors.
- Expands Intel386 SX CPU data types to include 32-bit, 64-bit, and 80-bit floating point; 32-bit and 64-bit integers; and 18-digit BCD operands.
- Directly extends the Intel386 SX CPU instruction set to trigonometric, logarithmic, exponential, and arithmetic functions for all data types.
- Operates independently of the real, protected, and virtual-86 modes of the Intel386 SX microprocessors.
- Fully compatible with the Intel387 SL Mobile and DX math coprocessors. Implements all Intel387 math coprocessor architectural enhancements over the 8087 and 80287.
- Implements ANSI/IEEE Standard 754-1985 for binary floating point arithmetic.
- Upward object code compatible from the 8087 and 80287.
2.2 Math Coprocessor Architecture

As shown in Figure 2-1, the Intel387 SX math coprocessor is internally divided into four sections: the bus control logic, the data interface and control logic, the floating point unit, and the power management unit. The bus control logic is responsible for the CPU bus tracking and interface. The data interface and control unit latches data and decodes instructions. The floating point unit executes the mathematical instructions. The power management unit is new to the Intel387 family and is the nucleus of the static architecture. It is responsible for shutting down idle sections of the device to save power.

Figure 2-1. Intel387 SX Math Coprocessor Block Diagram

Microprocessor/Math Coprocessor Interface

The Intel386 CPU interprets the pattern 11011B in the five most significant bits of an instruction as an opcode intended for a math coprocessor. Instructions thus marked are called ESCAPE or ESC instructions. Upon decoding the instruction as an ESC instruction, the Intel386 CPU transfers the opcode to the math coprocessor through an I/O write cycle at a dedicated address (8000F8H) outside the normal programmed I/O address range. The math coprocessor has dedicated output signals for controlling the data transfer and for notifying the CPU if the math coprocessor is busy or that a floating point error has occurred.

2.3 Power Management

The Intel387 SX math coprocessor offers two modes of power management: dynamic and idle.

2.3.1 Dynamic Mode

Dynamic mode is when the device is executing an instruction. Using Intel's CHMOS IV technology, the Intel387 SX math coprocessor draws considerably less power than its predecessor. The active power supply current is reduced to approximately 100 mA at 20 MHz, which also keeps case temperatures low.

2.3.2 Idle Mode

When an instruction is not being executed, the Intel387 SX math coprocessor automatically changes to idle mode. Three clocks after completion of the previous instruction, the internal power manager shuts down the floating point execution unit and all non-essential circuitry. Only portions of the bus interface unit remain active to monitor the CPU bus activity and to accept the next instruction when it is transferred. When the CPU transfers the next instruction to the math coprocessor, the Intel387 SX math coprocessor accepts the instruction and ramps the internal core within one clock, so there is no impact on performance or throughput.
In idle mode, the Intel387 SX math coprocessor typically draws 4 mA of current and reduces case temperature to near ambient.

NOTE: In asynchronous clock mode (CKM = 0), the internal idle mode is disabled.

2.4 Compatibility

The Intel387 SX math coprocessor is compatible with the Intel387 SL Mobile math coprocessor. Due to the increased performance and internal pipelining effects, diagnostic programs should never use instruction execution time for test purposes.

2.5 Performance

The increased performance of floating point calculations can be attributed to the 84-bit architecture and floating point processor. Without a math coprocessor, the CPU must execute floating point calculations through very long software emulation routines with reduced resolution and accuracy. The performance of the Intel387 SX math coprocessor has been further enhanced through improvements in the internal microcode and through internal architectural changes. These refinements increase Whetstone benchmarks by approximately 10% to 25% over the original Intel387 SX math coprocessor.

Real performance, however, should be measured with application software. Depending upon software coding, system overhead, and percentage of floating point instructions, performance can vary significantly.
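The typical supply currents quoted in Section 2.3 (100 mA in dynamic mode, 4 mA in idle mode) can be combined into a rough average-current estimate. The sketch below is only an illustration: the function names and the idea of a "busy fraction" are assumptions introduced here, and the model ignores the three-clock delay before idle mode is entered.

```python
def average_current_ma(busy_fraction, dynamic_ma=100.0, idle_ma=4.0):
    """Duty-cycle-weighted supply current in mA (hypothetical helper).

    busy_fraction is the share of time the coprocessor spends executing
    instructions; the defaults are the typical figures from Section 2.3.
    """
    if not 0.0 <= busy_fraction <= 1.0:
        raise ValueError("busy_fraction must be between 0 and 1")
    return busy_fraction * dynamic_ma + (1.0 - busy_fraction) * idle_ma


def idle_saving_percent(dynamic_ma=100.0, idle_ma=4.0):
    """Relative reduction in current when going from dynamic to idle mode."""
    return 100.0 * (dynamic_ma - idle_ma) / dynamic_ma


if __name__ == "__main__":
    print(idle_saving_percent())      # saving in percent with typical values
    print(average_current_ma(0.25))   # coprocessor busy a quarter of the time
```

With the typical values, the idle saving works out to 96%, in line with the roughly 95% reduction quoted on the cover page.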
3.0 PROGRAMMING INTERFACE

The Intel387 SX math coprocessor extends an Intel386 microprocessor system with additional instructions, registers, data types, and interrupts specifically designed to facilitate high-speed floating point processing. All communication between the CPU and the math coprocessor is transparent to applications software. The CPU automatically controls the math coprocessor whenever a numerics instruction is executed. All physical memory and virtual memory of the CPU are available for storage of the instructions and operands of programs that use the math coprocessor. All memory addressing modes, including use of displacement, base register, index register, and scaling, are available for addressing numerical operands.

The Intel387 SX math coprocessor is software compatible with the Intel387 DX math coprocessors and supports all applications written for the Intel386 CPU and Intel387 math coprocessors.

3.1 Instruction Set

The Intel386 CPU interprets the pattern 11011B in the five most significant bits of an instruction as an opcode intended for a math coprocessor. Instructions thus marked are called ESCAPE or ESC instructions.

The typical math coprocessor instruction accepts one or two operands and produces one or sometimes two results. In two-operand instructions, one operand is the contents of a math coprocessor register, while the other may be a memory location. The operands of some instructions are predefined; for example, FSQRT always takes the square root of the number in the top stack element.

The Intel387 SX math coprocessor instruction set can be divided into six groups. The following sections give a brief description of each instruction. Section 8.0 defines the instruction format and byte fields. Further details can be obtained from the Intel387 User's Manual, Programmer's Reference, Order #231917.
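The ESC test described above reduces to a single shift and compare. The following is a minimal sketch, assuming the byte examined is the opcode byte itself (after any instruction prefixes); the function name is hypothetical.

```python
def is_esc_opcode(opcode_byte):
    """Return True if the five most significant bits are 11011B."""
    if not 0 <= opcode_byte <= 0xFF:
        raise ValueError("opcode_byte must fit in one byte")
    return (opcode_byte >> 3) == 0b11011


if __name__ == "__main__":
    # List every byte value that the pattern selects.
    esc_bytes = [b for b in range(256) if is_esc_opcode(b)]
    print([hex(b) for b in esc_bytes])
```

The pattern leaves the low three bits free, so it selects the eight opcode bytes D8H through DFH.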
3.1.1 Data Transfer Instructions

This class includes the operations that load, store, and convert operands of any supported data type.

Real Transfers
  FLD     Load real (single, double, extended)
  FST     Store real (single, double)
  FSTP    Store real and pop (single, double, extended)
  FXCH    Exchange registers

Integer Transfers
  FILD    Load (convert from) integer (word, short, long)
  FIST    Store (convert to) integer (word, short)
  FISTP   Store (convert to) integer and pop (word, short, long)

Packed Decimal Transfers
  FBLD    Load (convert from) packed decimal
  FBSTP   Store packed decimal and pop

3.1.2 Arithmetic Instructions

This class of instructions provides variations on the basic add, subtract, multiply, and divide operations and a number of other basic arithmetic operations. Operands may reside in registers, or one operand may reside in memory.

Addition
  FADD    Add real
  FADDP   Add real and pop
  FIADD   Add integer

Subtraction
  FSUB    Subtract real
  FSUBP   Subtract real and pop
  FISUB   Subtract integer
  FSUBR   Subtract real reversed
  FSUBRP  Subtract real reversed and pop
  FISUBR  Subtract integer reversed

Multiplication
  FMUL    Multiply real
  FMULP   Multiply real and pop
  FIMUL   Multiply integer

Division
  FDIV    Divide real
  FDIVP   Divide real and pop
  FIDIV   Divide integer
  FDIVR   Divide real reversed
  FDIVRP  Divide real reversed and pop
  FIDIVR  Divide integer reversed
Other Operations
  FSQRT    Square root
  FSCALE   Scale
  FPREM    Partial remainder
  FPREM1   IEEE standard partial remainder
  FRNDINT  Round to integer
  FXTRACT  Extract exponent and significand
  FABS     Absolute value
  FCHS     Change sign

3.1.3 Comparison Instructions

Instructions of this class allow comparison of numbers of all supported real and integer data types. Each of these instructions analyzes the top stack element, often in relationship to another operand, and reports the result as a condition code in the status word.

  FCOM     Compare real
  FCOMP    Compare real and pop
  FCOMPP   Compare real and pop twice
  FUCOM    Unordered compare real
  FUCOMP   Unordered compare real and pop
  FUCOMPP  Unordered compare real and pop twice
  FICOM    Compare integer
  FICOMP   Compare integer and pop
  FTST     Test
  FXAM     Examine

3.1.4 Transcendental Instructions

This group of Intel387 operations includes trigonometric, inverse trigonometric, logarithmic, and exponential functions. The transcendental instructions operate on the top one or two stack elements, and they return their results to the stack. The trigonometric operations assume their arguments are expressed in radians. The logarithmic and exponential operations work in base 2.

  FSIN     Sine
  FCOS     Cosine
  FSINCOS  Sine and cosine
  FPTAN    Tangent
  FPATAN   Arctangent of ST(1)/ST
  F2XM1    2^x - 1
  FYL2X    y * log2(x)
  FYL2XP1  y * log2(x + 1)

3.1.5 Load Constant Instructions

Each of these instructions loads (pushes) a commonly used constant onto the stack. The constants have extended real values nearest to the infinitely precise numbers. The only error that can be generated is an invalid exception if a stack overflow occurs.

  FLDZ     Load +0.0
  FLD1     Load +1.0
  FLDPI    Load π
  FLDL2T   Load log2(10)
  FLDL2E   Load log2(e)
  FLDLG2   Load log10(2)
  FLDLN2   Load loge(2)
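Because the logarithmic and exponential instructions work in base 2, logarithms in other bases are obtained by pairing FYL2X with one of the load-constant instructions. The sketch below models that arithmetic in Python; it is not coprocessor code, the helper names simply mirror the mnemonics, and it uses double precision rather than the 80-bit extended format.

```python
import math

# Double-precision stand-ins for the constants pushed by FLDLN2 and FLDLG2.
FLDLN2 = math.log(2.0)     # loge(2)
FLDLG2 = math.log10(2.0)   # log10(2)


def fyl2x(y, x):
    """Arithmetic model of FYL2X: y * log2(x)."""
    return y * math.log2(x)


def f2xm1(x):
    """Arithmetic model of F2XM1: 2^x - 1."""
    return 2.0 ** x - 1.0


def natural_log(x):
    """ln(x) = loge(2) * log2(x), i.e. FLDLN2 followed by FYL2X."""
    return fyl2x(FLDLN2, x)


def common_log(x):
    """log10(x) = log10(2) * log2(x), i.e. FLDLG2 followed by FYL2X."""
    return fyl2x(FLDLG2, x)


if __name__ == "__main__":
    print(natural_log(8.0), math.log(8.0))
    print(common_log(1000.0))
```

The multiplier argument of FYL2X is what makes the change of base a single instruction once the constant is on the stack.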
3.1.6 Processor Instructions (Administrative)

  FINIT     Initialize math coprocessor
  FLDCW     Load control word
  FSTCW     Store control word
  FSTSW     Store status word
  FSTSW AX  Store status word to AX register
  FCLEX     Clear exceptions
  FSTENV    Store environment
  FLDENV    Load environment
  FSAVE     Save state
  FRSTOR    Restore state
  FINCSTP   Increment stack pointer
  FDECSTP   Decrement stack pointer
  FFREE     Free register
  FNOP      No operation
  FWAIT     Report math coprocessor error

3.2 Register Set

Figure 3-1 shows the Intel387 SX math coprocessor register set. When a math coprocessor is present in a system, programmers may use these registers in addition to the registers normally available on the CPU.

Figure 3-1. Intel386 CPU and Intel387 Math Coprocessor Register Set

  Intel386 microprocessor registers:
    General registers (32 bits): EAX, EBX, ECX, EDX (low 16 bits AX, BX, CX, DX, each split into high and low bytes AH/AL, BH/BL, CH/CL, DH/DL); ESI, EDI, EBP, ESP (low 16 bits SI, DI, BP, SP)
    Segment registers (16 bits): CS, SS, DS, ES, FS, GS
    EIP and EFLAGS (32 bits)

  Intel387 math coprocessor registers:
    Data registers R0-R7 (80 bits each): sign (bit 79), exponent (bits 78-64), significand (bits 63-0); each register has a 2-bit tag field
    Control register, status register, tag word (16 bits each)
    Instruction pointer and data pointer (48 bits each, held in the CPU)
3.2.1 Status Word (SW) Register

The 16-bit status word (in the status register) shown in Figure 3-2 reflects the overall state of the math coprocessor. It can be read and inspected by programs using the FSTSW memory or FSTSW AX instructions.

Bit 15, the busy bit (B), is included for 8087 compatibility only. It always has the same value as the error summary bit (ES, bit 7 of the status word); it does not indicate the status of the BUSY# output of the math coprocessor.

Bits 13-11 (TOP) serve as the pointer to the math coprocessor data register that is the current top-of-stack. The significance of the stack top is described in Section 3.2.3, Data Register.

The four numeric condition code bits (C3-C0; bits 14, 10-8) are similar to the flags in a CPU; instructions that perform arithmetic operations update these bits to reflect the outcome. The effects of the instructions on the condition code are summarized in Tables 3-1 through 3-4. These condition code bits are used principally for conditional branching. The FSTSW AX instruction stores the math coprocessor status word directly to the CPU AX register, allowing the condition codes to be inspected efficiently by Intel386 CPU code. The Intel386 CPU SAHF instruction can copy C3-C0 directly to the flag bits to simplify conditional branching. Table 3-5 shows the mapping of these bits to the Intel386 CPU flag bits.

Bit 7 is the error summary (ES) status bit. This bit is set if any unmasked exception bit is set; it is clear otherwise. If this bit is set, the ERROR# signal is asserted.

Bit 6 is the stack flag (SF). This bit is used to distinguish invalid operations due to stack overflow or underflow from other kinds of invalid operations. When SF is set, bit 9 (C1) distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0).
Bits 5-0 are the six exception flags of the status word. They are set to indicate that, during an instruction execution, the math coprocessor has detected one of six possible exception conditions since these status bits were last cleared or reset. Section 3.5, Exception Handling, explains how they are set and used.

The exception flags are "sticky" bits and can only be cleared by the instructions FINIT, FCLEX, FLDENV, FSAVE, and FRSTOR. Note that when a new value is loaded into the status word by the FLDENV or FRSTOR instruction, the values of ES (bit 7) and B (bit 15) are not derived from the values loaded from memory but rather depend upon the values of the exception flags (bits 5-0) in the status word and their corresponding masks in the control word. If ES is set in such a case, the ERROR# output of the math coprocessor is activated immediately.

Figure 3-2. Status Word

  ES is set if any unmasked exception bit is set; cleared otherwise.
  See Table 3-1 for interpretation of the condition code.
  TOP values:
    000 = Register 0 is top of stack
    001 = Register 1 is top of stack
    ...
    111 = Register 7 is top of stack
  For definitions of exceptions, refer to the section entitled "Exception Handling".
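The bit assignments given in this section can be collected into a small decoder. This is a sketch for illustration only: the function names are hypothetical, and the individual names of the six exception flags (IE, DE, ZE, OE, UE, PE for bits 0 through 5) follow the usual x87 convention rather than anything stated in this section, which names only IE and PE in passing.

```python
# Assumed names for bits 0..5, following the common x87 convention.
EXCEPTION_FLAGS = ("IE", "DE", "ZE", "OE", "UE", "PE")


def decode_status_word(sw):
    """Split a 16-bit status word into the fields described in Section 3.2.1."""
    sw &= 0xFFFF
    return {
        "B": (sw >> 15) & 1,        # busy bit, mirrors ES
        "C3": (sw >> 14) & 1,
        "TOP": (sw >> 11) & 0b111,  # current top-of-stack register
        "C2": (sw >> 10) & 1,
        "C1": (sw >> 9) & 1,
        "C0": (sw >> 8) & 1,
        "ES": (sw >> 7) & 1,        # error summary
        "SF": (sw >> 6) & 1,        # stack flag
        "flags": [name for bit, name in enumerate(EXCEPTION_FLAGS)
                  if sw & (1 << bit)],
    }


def stack_fault(sw):
    """Return 'overflow', 'underflow', or None when no stack fault is flagged."""
    fields = decode_status_word(sw)
    if not (fields["SF"] and "IE" in fields["flags"]):
        return None
    return "overflow" if fields["C1"] else "underflow"


if __name__ == "__main__":
    print(decode_status_word(0xBAC1))
    print(stack_fault(0xBAC1))
```

A value such as 0xBAC1 (B, TOP = 7, C1, ES, SF, and the invalid-operation flag set) decodes as a stack overflow, matching the SF and C1 rule above.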
Table 3-1. Condition Code Interpretation

  Columns: Instruction | C0 (S) | C3 (Z) | C1 (A) | C2 (C)

  FPREM, FPREM1
    C0, C3, C1: three least significant bits of quotient (see Table 3-2): C0 = Q2, C3 = Q1, C1 = Q0 or O/U#
    C2: Reduction (0 = complete, 1 = incomplete)

  FCOM, FCOMP, FCOMPP, FTST, FUCOM, FUCOMP, FUCOMPP, FICOM, FICOMP
    C0, C3: result of comparison (see Table 3-3)
    C1: zero or O/U#
    C2: operand is not comparable (Table 3-3)

  FXAM
    C0, C3: operand class (see Table 3-4)
    C1: sign or O/U#
    C2: operand class (Table 3-4)

  FCHS, FABS, FXCH, FINCSTP, FDECSTP, constant loads, FXTRACT, FLD, FILD, FBLD, FSTP (ext real)
    C0, C3: UNDEFINED
    C1: zero or O/U#
    C2: UNDEFINED

  FIST, FBSTP, FRNDINT, FST, FSTP, FADD, FMUL, FDIV, FDIVR, FSUB, FSUBR, FSCALE, FSQRT, FPATAN, F2XM1, FYL2X, FYL2XP1
    C0, C3: UNDEFINED
    C1: Roundup or O/U#
    C2: UNDEFINED

  FPTAN, FSIN, FCOS, FSINCOS
    C0, C3: UNDEFINED
    C1: Roundup or O/U#; UNDEFINED if C2 = 1
    C2: Reduction (0 = complete, 1 = incomplete)

  FLDENV, FRSTOR
    C0, C3, C1, C2: each bit loaded from memory

  FLDCW, FSTENV, FSTCW, FSTSW, FCLEX, FINIT, FSAVE
    C0, C3, C1, C2: UNDEFINED

  O/U#       When both the IE and SF bits of the status word are set, indicating a stack exception, this bit distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0).
  Reduction  If FPREM or FPREM1 produces a remainder that is less than the modulus, reduction is complete. When reduction is incomplete, the value at the top of the stack is a partial remainder, which can be used as input to further reduction. For FPTAN, FSIN, FCOS, and FSINCOS, the reduction bit is set if the operand at the top of the stack is too large. In this case the original operand remains at the top of the stack.
  Roundup    When the PE bit of the status word is set, this bit indicates whether the last rounding in the instruction was upward.
  UNDEFINED  Do not rely on finding any specific value in these bits.
Table 3-2. Condition Code Interpretation after FPREM and FPREM1 Instructions

  C2   C3 (Q1)   C1 (Q0)   C0 (Q2)   Interpretation after FPREM and FPREM1
  1    X         X         X         Incomplete reduction: further iteration required for complete reduction
  0    0         0         0         Complete reduction, Q mod 8 = 0
  0    0         1         0         Complete reduction, Q mod 8 = 1
  0    1         0         0         Complete reduction, Q mod 8 = 2
  0    1         1         0         Complete reduction, Q mod 8 = 3
  0    0         0         1         Complete reduction, Q mod 8 = 4
  0    0         1         1         Complete reduction, Q mod 8 = 5
  0    1         0         1         Complete reduction, Q mod 8 = 6
  0    1         1         1         Complete reduction, Q mod 8 = 7

  On complete reduction, C0, C3, and C1 contain the three least significant bits of the quotient.

Table 3-3. Condition Code Resulting from Comparison

  Order            C3   C2   C0
  TOP > Operand    0    0    0
  TOP < Operand    0    0    1
  TOP = Operand    1    0    0
  Unordered        1    1    1

Table 3-4. Condition Code Defining Operand Class

  C3   C2   C1   C0   Value at TOP
  0    0    0    0    + Unsupported
  0    0    0    1    + NaN
  0    0    1    0    - Unsupported
  0    0    1    1    - NaN
  0    1    0    0    + Normal
  0    1    0    1    + Infinity
  0    1    1    0    - Normal
  0    1    1    1    - Infinity
  1    0    0    0    + 0
  1    0    0    1    + Empty
  1    0    1    0    - 0
  1    0    1    1    - Empty
  1    1    0    0    + Denormal
  1    1    1    0    - Denormal

Table 3-5. Mapping Condition Codes to Intel386 CPU Flag Bits
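Tables 3-2 through 3-4 translate directly into lookups on a status word value, for example one obtained with FSTSW AX. The sketch below is illustrative only; the function and table names are hypothetical, and the bit positions of C3-C0 are those given in Section 3.2.1.

```python
def condition_codes(sw):
    """Extract C3 (bit 14), C2 (bit 10), C1 (bit 9), and C0 (bit 8)."""
    return {"C3": (sw >> 14) & 1, "C2": (sw >> 10) & 1,
            "C1": (sw >> 9) & 1, "C0": (sw >> 8) & 1}


# Table 3-3, keyed by (C3, C2, C0).
COMPARE = {(0, 0, 0): "TOP > operand", (0, 0, 1): "TOP < operand",
           (1, 0, 0): "TOP = operand", (1, 1, 1): "unordered"}

# Table 3-4 without the sign, keyed by the bits C3, C2, C0.
FXAM_CLASS = {0b000: "unsupported", 0b001: "NaN", 0b010: "normal",
              0b011: "infinity", 0b100: "zero", 0b101: "empty",
              0b110: "denormal"}


def compare_result(sw):
    cc = condition_codes(sw)
    return COMPARE.get((cc["C3"], cc["C2"], cc["C0"]), "undefined")


def fprem_quotient_mod8(sw):
    """Table 3-2: return Q mod 8, or None while reduction is incomplete."""
    cc = condition_codes(sw)
    if cc["C2"]:
        return None
    return (cc["C0"] << 2) | (cc["C3"] << 1) | cc["C1"]  # Q2, Q1, Q0


def fxam_class(sw):
    """Table 3-4: return (sign, class) for the value examined by FXAM."""
    cc = condition_codes(sw)
    sign = "-" if cc["C1"] else "+"
    key = (cc["C3"] << 2) | (cc["C2"] << 1) | cc["C0"]
    return sign, FXAM_CLASS.get(key, "undefined")


if __name__ == "__main__":
    print(compare_result(0x4000))       # only C3 set
    print(fprem_quotient_mod8(0x0300))  # C1 and C0 set, C2 clear
    print(fxam_class(0x0600))           # C2 and C1 set
```

In assembly the same tests are usually made by FSTSW AX followed by SAHF; on x86 processors SAHF places C0 in CF, C2 in PF, and C3 in ZF, which is the kind of mapping Table 3-5 refers to.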
3.2.2 Control Word (CW) Register

The math coprocessor provides the programmer with several processing options that are selected by loading a control word from memory into the control register. Figure 3-3 shows the format and encoding of fields in the control word.

The low-order byte of the control word register is used to configure the exception masking. Bits 5-0 of the control word contain individual masks for each of the six exceptions that the math coprocessor recognizes. See Section 3.5, Exception Handling, for further explanation of the exception control and definition.

The high-order byte of the control word is used to configure the math coprocessor operating mode, including precision, rounding, and infinity control.

- The rounding control (RC) field (bits 11-10) provides for directed rounding and true chop, as well as the unbiased round to nearest even mode specified in the IEEE standard. Rounding control affects only those instructions that perform rounding at the end of the operation (and thus can generate a precision exception); namely, FST, FSTP, FIST, all arithmetic instructions (except FPREM, FPREM1, FXTRACT, FABS, and FCHS), and all transcendental instructions.
- The precision control (PC) field (bits 9-8) can be used to set the math coprocessor internal operating precision of the significand at less than the default of 64 bits (extended precision). This can be useful in providing compatibility with early generation arithmetic processors of smaller precision. PC affects only the instructions FADD, FSUB(R), FMUL, FDIV(R), and FSQRT. For all other instructions, either the precision is determined by the opcode or extended precision is used.
- The "infinity control bit" (bit 12) is not meaningful to the Intel387 SX math coprocessor, and programs must ignore its value. To maintain compatibility with the 8087 and 80287 (non-387 core), this bit can be programmed; however, regardless of its value, the Intel387 SX math coprocessor always treats infinity in the affine sense (-∞ < +∞). This bit is initialized to zero both after a hardware reset and after the FINIT instruction.

All other bits are reserved and should not be programmed, to assure compatibility with future processors.

Figure 3-3. Control Word

  Precision control:
    00 = 24 bits (single precision)
    01 = (reserved)
    10 = 53 bits (double precision)
    11 = 64 bits (extended precision)

  Rounding control:
    00 = Round to nearest or even
    01 = Round down (toward -∞)
    10 = Round up (toward +∞)
    11 = Chop (truncate toward zero)
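The control word fields can be decoded and modified with simple masks, as sketched below. The function names are hypothetical. The example value 0x037F is the control word that x87 coprocessors are generally documented to hold after FINIT; that value is not stated in this section and is used here only as an assumption for the demonstration.

```python
PRECISION = {0b00: "24 bits (single)", 0b01: "reserved",
             0b10: "53 bits (double)", 0b11: "64 bits (extended)"}
ROUNDING = {0b00: "nearest or even", 0b01: "down",
            0b10: "up", 0b11: "chop"}


def decode_control_word(cw):
    """Split a 16-bit control word into the fields of Figure 3-3."""
    return {
        "masks": cw & 0x3F,               # bits 5-0, exception masks
        "PC": PRECISION[(cw >> 8) & 0b11],
        "RC": ROUNDING[(cw >> 10) & 0b11],
        "IC": (cw >> 12) & 1,             # infinity control, ignored by the device
    }


def with_rounding(cw, rc):
    """Return cw with the RC field (bits 11-10) replaced by rc."""
    cleared = cw & ~(0b11 << 10) & 0xFFFF
    return cleared | ((rc & 0b11) << 10)


if __name__ == "__main__":
    cw = 0x037F                          # assumed post-FINIT value
    print(decode_control_word(cw))
    print(hex(with_rounding(cw, 0b11)))  # switch to chop mode
```

A program would typically read the current value with FSTCW, change only the field it needs in this way, and write the result back with FLDCW so that the reserved bits stay untouched.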
3.2.3 Data Register

The Intel387 SX math coprocessor data register set consists of eight registers (R0-R7), which are treated as both a stack and a general register file. Each of these data registers in the math coprocessor is 80 bits wide and is divided into fields corresponding to the math coprocessor's extended-precision real data type, which is used for internal calculations.

The math coprocessor register set can be accessed either as a stack, with instructions operating on the top one or two stack elements, or as individually addressable registers. The TOP field in the status word identifies the current top-of-stack register. A "push" operation decrements TOP by one and loads a value into the new top register. A "store and pop" operation stores the value from the current top register into memory and then increments TOP by one. The math coprocessor register stack grows "down" toward lower-addressed registers.

Most of the Intel387 SX math coprocessor operations use the register stack as the operand(s) and/or as a place to store the result. Instructions may address the data registers either implicitly or explicitly. Many instructions operate on the register at the top of the stack. These instructions implicitly address the register at which TOP points. Other instructions allow the programmer to explicitly specify which register to use. Explicit register addressing is also relative to TOP (where ST denotes the current stack top and ST(i) refers to the i'th register from ST in the stack, so the real register address is computed as ST + i).

3.2.4 Tag Word (TW) Register

The tag word marks the content of each numeric data register, as Figure 3-4 shows. Each two-bit tag represents one of the eight data registers. The principal function of the tag word is to optimize the math coprocessor's performance and stack handling by making it possible to distinguish between empty and non-empty register locations.
it also enables excep- tion handlers to identify special values (e.g. nans or denormals) in the contents of a stack location with- out the need to perform complex decoding of the actual data. 3.2.5 instruction and data pointers because the math coprocessor operates in parallel with the cpu, any exceptions detected by the math coprocessor may be reported after the cpu has ex- ecuted the esc instruction which caused it. to allow identification of the numeric instruction which caused the exception, the intel386 microprocessor contains registers that aid in diagnosis. these regis- ters supply the address of the failing instruction and the address of its numeric memory operand (if ap- propriate). the instruction and data pointers are provided for user-written exception handlers. these registers are located in the cpu, but appear to be located in the math coprocessor because they are accessed by the esc instructions fldenv, fstenv, fsave, and frstor; which transfer the values between the registers and memory. whenever the cpu exe- cutes a new esc instruction (except administrative instructions), it saves the address of the instruction (including any prefixes that may be present), the ad- dress of the operand (if present) and the opcode. the instruction and data pointers appear in one of four formats depending on the operating mode of the cpu (protected mode or real-address mode) and depending on the operand size attribute in ef- fect (32-bit operand or 16-bit operand). (see figures 3-5, 3-6, 3-7, and 3-8.) note that the value of the data pointer is undefined if the prior esc instruction did not have a memory operand. 15 0 tag (7) tag (6) tag (5) tag (4) tag (3) tag (2) tag (1) tag (0) note: the index i of tag(i) is not top-relative. a program typically uses the ``top'' field of status word to determine which tag(i) field refers to logical top of stack. tag values: 00 e valid 01 e zero 10 e qnan, snan, infinity, denormal and unsupported formats 11 e empty figure 3-4. 
tag word register 16 16
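The relationship between TOP, ST(i) and the physical TAG(i) fields can be sketched in a few lines of Python. The wrap-around modulo 8 is an assumption implied by the datasheet's statement that a push from TOP = 0 makes register seven the new top. The helper names (`physical_register`, `tag_of_st`, `push`) are hypothetical.

```python
# Minimal sketch of TOP-relative addressing and tag lookup.
# TAG(i) occupies bits 2i+1..2i of the tag word, as laid out in Figure 3-4.
TAG_MEANING = {
    0b00: "valid",
    0b01: "zero",
    0b10: "special",   # QNaN, SNaN, infinity, denormal, unsupported formats
    0b11: "empty",
}


def physical_register(top: int, i: int) -> int:
    """Physical register number for ST(i): ST + i, wrapped to 0..7 (assumed)."""
    return (top + i) % 8


def tag_of_st(tag_word: int, top: int, i: int) -> str:
    """Look up the tag of ST(i) using the non-top-relative TAG fields."""
    reg = physical_register(top, i)
    return TAG_MEANING[(tag_word >> (2 * reg)) & 0b11]


def push(top: int) -> int:
    """A push decrements TOP by one before loading the new top register."""
    return (top - 1) % 8


top = push(0)            # TOP starts at zero after FNINIT
tag_word = 0x3FFF        # TAG(7) = 00 (valid), all other tags = 11 (empty)
print(top, tag_of_st(tag_word, top, 0), tag_of_st(tag_word, top, 1))
```

The example mirrors the state after a single load following initialization: register 7 holds the only valid entry, so ST(0) is valid and ST(1), which maps to register 0, is still empty.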
32-Bit Protected Mode Format

  Offset   Contents (bit 31 down to bit 0)
  0H       reserved (31..16) | control word (15..0)
  4H       reserved (31..16) | status word (15..0)
  8H       reserved (31..16) | tag word (15..0)
  CH       IP offset
  10H      00000 | opcode 10..0 | CS selector (15..0)
  14H      data operand offset
  18H      reserved (31..16) | operand selector (15..0)

Figure 3-5. Instruction and Data Pointer Image in Memory, 32-Bit Protected-Mode Format

16-Bit Protected Mode Format

  Offset   Contents (bits 15..0)
  0H       control word
  2H       status word
  4H       tag word
  6H       IP offset
  8H       CS selector
  AH       operand offset
  CH       operand selector

Figure 3-6. Instruction and Data Pointer Image in Memory, 16-Bit Protected-Mode Format

32-Bit Real-Address Mode Format

  Offset   Contents (bit 31 down to bit 0)
  0H       reserved (31..16) | control word (15..0)
  4H       reserved (31..16) | status word (15..0)
  8H       reserved (31..16) | tag word (15..0)
  CH       reserved (31..16) | instruction pointer 15..0
  10H      0000 | instruction pointer 31..16 | 0 | opcode 10..0
  14H      reserved (31..16) | operand pointer 15..0
  18H      0000 | operand pointer 31..16 | 0000 00000000

Figure 3-7. Instruction and Data Pointer Image in Memory, 32-Bit Real-Mode Format
16-Bit Real-Address Mode and Virtual 8086 Mode Format

  Offset   Contents (bit 15 down to bit 0)
  0H       control word
  2H       status word
  4H       tag word
  6H       instruction pointer 15..0
  8H       IP 19..16 | 0 | opcode 10..0
  AH       operand pointer 15..0
  CH       DP 19..16 | 0 | 00000000000

Figure 3-8. Instruction and Data Pointer Image in Memory, 16-Bit Real-Mode Format

3.3 Data Types

Table 3-6 lists the seven data types that the math coprocessor supports and presents the format for each type. Operands are stored in memory with the least significant digit at the lowest memory address. Programs retrieve these values by generating the lowest address. For maximum system performance, all operands should start at physical-memory addresses that correspond to the word size of the CPU; operands may begin at any other address, but will then require extra memory cycles to access the entire operand.

The data type formats can be divided into three classes: binary integer, decimal integer, and binary real. These formats, however, exist in memory only. Internally, the math coprocessor holds all numbers in the extended-precision real format. Instructions that load operands from memory automatically convert operands represented in memory as 16-, 32-, or 64-bit integers, 32- or 64-bit floating-point numbers, or 18-digit packed BCD numbers into extended-precision real format. Instructions that store operands in memory perform the inverse type conversion.

In addition to the typical real and integer data values, the Intel387 SX math coprocessor data formats encompass encodings for a variety of special values. These special values have significance and can express relevant information about the computations or operations that produced them. The various types of special values are denormal real numbers, zeros, positive and negative infinity, NaNs (not-a-number), indefinite, and unsupported formats. For further information on data types and formats, see the Intel387 Programmer's Reference Manual.

3.4 Interrupt Description

CPU interrupts are used to report errors or exceptional conditions while executing numeric programs in either real or protected mode. Table 3-7 shows these interrupts and their functions.

3.5 Exception Handling

The math coprocessor detects six different exception conditions that occur during instruction execution. Table 3-8 lists the exception conditions in order of precedence, showing for each the cause and the default action taken by the math coprocessor if the exception is masked by its corresponding mask bit in the control word.

Any exception that is not masked by the control word sets the corresponding exception flag of the status word, sets the ES bit of the status word, and asserts the ERROR# signal. When the CPU attempts to execute another ESC instruction or WAIT, exception 16 occurs. The exception condition must be resolved via an interrupt service routine. The return address pushed onto the CPU stack upon entry to the service routine does not necessarily point to the failing instruction nor to the following instruction. The CPU saves the address of the floating-point instruction that caused the exception and the address of any memory operand required by that instruction.
Table 3-6. Intel387 SX Math Coprocessor Data Type Representation in Memory

Notes:
1. S = sign bit (0 = positive, 1 = negative)
2. D_n = decimal digit (two per byte)
3. X = bits have no significance; the math coprocessor ignores them when loading and zeros them when storing
4. ▲ = position of implicit binary point
5. I = integer bit of significand; stored in temporary real, implicit in single and double precision
6. Exponent bias (normalized values):
     Single: 127 (7FH)
     Double: 1023 (3FFH)
     Extended real: 16383 (3FFFH)
7. Packed BCD: $(-1)^S \, (D_{17} \ldots D_0)$
8. Real: $(-1)^S \, (2^{E-\mathrm{BIAS}}) \, (F_0\,F_1 \ldots)$
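Note 8 and the single-precision bias from note 6 can be checked with a small Python sketch. It assumes the IEEE 754 single-precision field widths (1 sign bit, 8 exponent bits, 23 fraction bits), which the notes above do not spell out, and it handles normalized values only. The function name `decode_single` is hypothetical.

```python
# Minimal sketch: evaluate (-1)^S * 2^(E - BIAS) * significand for a
# normalized single-precision value, then cross-check with struct.
import struct

SINGLE_BIAS = 127  # exponent bias for single precision (note 6)


def decode_single(bits: int) -> float:
    """Decode a normalized 32-bit single-precision pattern by hand."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent in (0, 0xFF):
        raise ValueError("zero, denormal, infinity and NaN are not handled here")
    # The integer bit is implicit in single precision, so the significand is 1.f
    significand = 1 + fraction / 2**23
    return (-1) ** sign * 2.0 ** (exponent - SINGLE_BIAS) * significand


pattern = 0xC0400000
by_hand = decode_single(pattern)
by_struct = struct.unpack(">f", pattern.to_bytes(4, "big"))[0]
print(by_hand, by_struct)
```

Both paths give -3.0 for the pattern C0400000H: the sign bit is set, the biased exponent is 128 (so the true exponent is 1), and the fraction contributes 0.5 to the implicit leading 1.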
Table 3-7. CPU Interrupt Vectors Reserved for Math Coprocessor

Interrupt 7
  An ESC instruction was encountered when EM or TS of CPU control register zero (CR0) was set. EM = 1 indicates that software emulation of the instruction is required. When TS is set, either an ESC or WAIT instruction causes interrupt 7. This indicates that the current math coprocessor context may not belong to the current task.

Interrupt 9
  In a protected-mode system, an operand of a coprocessor instruction wrapped around an addressing limit (0FFFFH for expand-up segments, zero for expand-down segments) and spanned inaccessible addresses (see note 1). The failing numerics instruction is not restartable. The address of the failing numerics instruction and data operand may be lost; an FSTENV does not return reliable addresses. The segment overrun exception should be handled by executing an FNINIT instruction (i.e., an FINIT without a preceding WAIT). The exception can be avoided by never allowing numerics operands to cross the end of a segment.

Interrupt 13
  In a protected-mode system, the first word of a numeric operand is not entirely within the limit of its segment. The return address pushed onto the stack of the exception handler points at the ESC instruction that caused the exception, including any prefixes. The math coprocessor has not executed this instruction; the instruction pointer and data pointer registers refer to a previous, correctly executed instruction.

Interrupt 16
  The previous numerics instruction caused an unmasked exception. The address of the faulty instruction and the address of its operand are stored in the instruction pointer and data pointer registers. Only ESC and WAIT instructions can cause this interrupt. The CPU return address pushed onto the stack of the exception handler points to a WAIT or ESC instruction (including prefixes). This instruction can be restarted after clearing the exception condition in the math coprocessor. FNINIT, FNCLEX, FNSTSW, FNSTENV, and FNSAVE cannot cause this interrupt.

Note:
1. An operand may wrap around an addressing limit when the segment limit is near an addressing limit and the operand is near the largest valid address in the segment. Because of the wrap-around, the beginning and ending addresses of such an operand will be at opposite ends of the segment. There are two ways that such an operand may also span inaccessible addresses: 1) if the segment limit is not equal to the addressing limit (e.g. addressing limit is FFFFH and segment limit is FFFDH), the operand will span addresses that are not within the segment (e.g. an 8-byte operand that starts at valid offset FFFCH will span addresses FFFC-FFFFH and 0000-0003H; however, addresses FFFEH and FFFFH are not valid, because they exceed the limit); 2) if the operand begins and ends in present and accessible segments but intermediate bytes of the operand fall in a not-present page or in a segment or page to which the procedure does not have access rights.

Table 3-8. Intel387 SX Math Coprocessor Exceptions

Invalid operation
  Cause: Operation on a signalling NaN, unsupported format, indeterminate form (0 × ∞, 0/0, (+∞) + (-∞), etc.), or stack overflow/underflow (SF is also set).
  Default action (if exception is masked): Result is a quiet NaN, integer indefinite, or BCD indefinite.

Denormalized operand
  Cause: At least one of the operands is denormalized, i.e., it has the smallest exponent but a nonzero significand.
  Default action (if exception is masked): Normal processing continues.

Zero divisor
  Cause: The divisor is zero while the dividend is a noninfinite, nonzero number.
  Default action (if exception is masked): Result is ∞.

Overflow
  Cause: The result is too large in magnitude to fit in the specified format.
  Default action (if exception is masked): Result is largest finite value or ∞.

Underflow
  Cause: The true result is nonzero but too small to be represented in the specified format, and, if the underflow exception is masked, denormalization causes the loss of accuracy.
  Default action (if exception is masked): Result is denormalized or zero.

Inexact result (precision)
  Cause: The true result is not exactly representable in the specified format (e.g. 1/3); the result is rounded according to the rounding mode.
  Default action (if exception is masked): Normal processing continues.
3.6 Initialization

After FNINIT or reset, the control word contains the value 037FH (all exceptions masked, precision control 64 bits, rounding to nearest), the same value as in an Intel287 after reset. For compatibility with the 8087 and Intel287, the bit that used to indicate infinity control (bit 12) is set to zero; however, regardless of its setting, infinity is treated in the affine sense. After FNINIT or reset, the status word is initialized as follows:

• All exceptions are set to zero.
• Stack TOP is zero, so that after the first push the stack top will be register seven (111B).
• The condition code C3-C0 is undefined.
• The B-bit is zero.

The tag word contains FFFFH (all stack locations are empty).

The Intel386 microprocessor and Intel387 math coprocessor initialization software must execute an FNINIT instruction (i.e., FINIT without a preceding WAIT) after reset. The FNINIT is not strictly required for the Intel386 software, but Intel recommends its use to help ensure upward compatibility with other processors. After a hardware reset, the ERROR# output is asserted to indicate that an Intel387 math coprocessor is present. To accomplish this, the IE (invalid exception) and ES (error summary) bits of the status word are set, and the IM bit (invalid exception mask) in the control word is cleared. After FNINIT, the status word and the control word have the same values as in an Intel287 math coprocessor after reset.

3.7 Processing Modes

The Intel387 SX math coprocessor works the same whether the CPU is executing in real-address mode, protected mode, or virtual-8086 mode. All references to memory for numerics data or status information are performed by the CPU, and therefore obey the memory-management and protection rules of the CPU mode currently in effect. The Intel387 SX math coprocessor merely operates on instructions and values passed to it by the CPU and therefore is not sensitive to the processing mode of the CPU.

In real-address mode and virtual-8086 mode, the Intel387 SX math coprocessor is completely upward compatible with software for the 8086/8087 and 80286/80287 real-address mode systems.

In protected mode, the Intel387 SX math coprocessor is completely upward compatible with software for the 80286/80287 protected mode system.

The only difference of operation that may appear when 8086/8087 programs are ported to protected mode (not using virtual-8086 mode) is in the format of operands for the administrative instructions FLDENV, FSTENV, FRSTOR, and FSAVE.

3.8 Programming Support

Using the Intel387 SX math coprocessor requires no special programming tools, because all new instructions and data types are directly supported by the assembler and compilers for high-level languages. All Intel386 microprocessor development tools that support Intel387 math coprocessor programs can also be used to develop software for the Intel386 SX microprocessors and Intel387 SX math coprocessors. All 8086/8088 development tools that support the 8087 can also be used to develop software for the CPU and math coprocessor in real-address mode or virtual-8086 mode. All 80286 development tools that support the Intel287 math coprocessor can also be used to develop software for the Intel386 CPU and Intel387 math coprocessor.

4.0 Hardware System Interface

In the following description of the hardware interface, the # symbol at the end of a signal name indicates that the active or asserted state occurs when the signal is at a low voltage. When no # is present after the signal name, the signal is asserted when at the high voltage level.
4.1 Signal Description

In the following signal descriptions, the Intel387 SX math coprocessor pins are grouped by function as shown by Table 4-1. Table 4-1 lists every pin by its identifier, gives a brief description and lists some of its characteristics (refer to Figure 1-1 and Table 1-1 for pin configuration).

All output signals can be tri-stated by driving STEN inactive. The output buffers of the bi-directional data pins D15-D0 are also tri-state; they only leave the floating state during read cycles when the math coprocessor is selected.

4.1.1 Intel386 CPU Clock 2 (CPUCLK2)

This input uses the CLK2 signal of the CPU to time the bus control logic. Several other math coprocessor signals are referenced to the rising edge of this signal. When CKM = 1 (synchronous mode) this pin also clocks the data interface and control unit and the floating point unit of the math coprocessor. This pin requires CMOS-level input. The signal on this pin is divided by two to produce the internal clock signal CLK.

4.1.2 Intel387 Math Coprocessor Clock 2 (NUMCLK2)

When CKM = 0 (asynchronous mode), this pin provides the clock for the data interface and control unit and the floating point unit of the math coprocessor. In this case, the ratio of the frequency of NUMCLK2 to the frequency of CPUCLK2 must lie within the range 10:16 to 14:10 and the maximum frequency must not exceed the device specifications. When CKM = 1 (synchronous mode), signals on this pin are ignored: CPUCLK2 is used instead for the data interface and control unit and the floating point unit. This pin requires CMOS-level input and should be tied low if not used.

Table 4-1. Pin Summary

  Pin name   Function                       Active state   Input/Output   Referenced to

  Execution control
  CPUCLK2    Microprocessor clock2                         I
  NUMCLK2    Math coprocessor clock2                       I
  CKM        Math coprocessor clock mode                   I
  RESETIN    System reset                   High           I              CPUCLK2

  Math coprocessor handshake
  PEREQ      Processor request              High           O              CPUCLK2
  BUSY#      Busy status                    Low            O              CPUCLK2
  ERROR#     Error status                   Low            O              NUMCLK2

  Bus interface
  D15-D0     Data pins                                     I/O            CPUCLK2
  W/R#       Write/read bus cycle           High/Low       I              CPUCLK2
  ADS#       Address strobe                 Low            I              CPUCLK2
  READY#     Bus ready input                Low            I              CPUCLK2
  READYO#    Ready output                   Low            O              CPUCLK2

  Chip/port select
  STEN       Status enable                  High           I              CPUCLK2
  NPS1#      Numerics select #1             Low            I              CPUCLK2
  NPS2       Numerics select #2             High           I              CPUCLK2
  CMD0#      Command                        Low            I              CPUCLK2

  Power and ground
  VCC        System power
  VSS        System ground
4.1.3 Clocking Mode (CKM)

This pin is a strapping option. When it is strapped to VCC (high), the math coprocessor operates in synchronous mode; when strapped to VSS (low), the math coprocessor operates in asynchronous mode. These modes relate to clocking of the internal data interface and control unit and the floating point unit only; the bus control logic always operates synchronously with respect to the CPU.

Synchronous mode requires the use of only one clock, the CPU's CLK2. Use of synchronous mode eliminates one clock generator from the board design and is recommended for all designs. Synchronous mode also allows the internal power management unit to enable the idle and standby power saving modes.

Asynchronous mode can provide higher performance of the floating point unit by running a faster clock on NUMCLK2. (The CPU's CLK2 must still be connected to the CPUCLK2 input.) This allows the floating point unit to run up to 40% faster than in synchronous mode. Internal power management is disabled in asynchronous mode.

4.1.4 System Reset (RESETIN)

A low to high transition on this pin causes the math coprocessor to terminate its present activity and to enter a dormant state. RESETIN must remain active (high) for at least 40 CPUCLK2 (NUMCLK2 if CKM = 0) periods.

The high to low transitions of RESETIN must be synchronous with CPUCLK2, so that the phase of the internal clock of the bus control logic (which is CPUCLK2 divided by two) is the same as the phase of the internal clock of the CPU. After RESETIN goes low, at least 50 CPUCLK2 (NUMCLK2 if CKM = 0) periods must pass before the first math coprocessor instruction is written into the math coprocessor. This pin should be connected to the CPU RESET pin. Table 4-2 shows the status of the output pins during the reset sequence. After a reset, all output pins return to their inactive state except for ERROR#, which remains active (for CPU recognition) until cleared.
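The asynchronous-mode clock ratio limit from section 4.1.2 (10:16 to 14:10) and the reset timing counts above (40 clock periods high, 50 periods after release) lend themselves to a quick design check. The following Python sketch is illustrative only: the function names and the example frequencies are hypothetical and are not device specifications, and the separate maximum-frequency limit from the electrical specifications is not modeled.

```python
# Minimal sketch: check the NUMCLK2/CPUCLK2 ratio and convert the reset
# clock counts into time for a given clock frequency.
from fractions import Fraction

RATIO_MIN = Fraction(10, 16)   # lowest allowed NUMCLK2 : CPUCLK2
RATIO_MAX = Fraction(14, 10)   # highest allowed NUMCLK2 : CPUCLK2
RESET_HIGH_CLOCKS = 40         # RESETIN must stay high at least this long
RESET_RECOVERY_CLOCKS = 50     # wait after RESETIN falls before first instruction


def numclk2_ratio_ok(numclk2_hz: int, cpuclk2_hz: int) -> bool:
    """True if the asynchronous-mode frequency ratio is inside the range."""
    ratio = Fraction(numclk2_hz, cpuclk2_hz)
    return RATIO_MIN <= ratio <= RATIO_MAX


def reset_times_ns(clock_hz: int) -> tuple:
    """Minimum RESETIN high time and recovery time, in nanoseconds."""
    period_ns = Fraction(10**9, clock_hz)
    return RESET_HIGH_CLOCKS * period_ns, RESET_RECOVERY_CLOCKS * period_ns


# Hypothetical board: 40 MHz CPUCLK2 with a 56 MHz NUMCLK2 oscillator.
print(numclk2_ratio_ok(56_000_000, 40_000_000))   # ratio 14:10, upper limit
print(numclk2_ratio_ok(60_000_000, 40_000_000))   # ratio 15:10, too fast
print(reset_times_ns(40_000_000))
```

The upper limit of 14:10 corresponds to the "up to 40% faster" figure quoted for asynchronous mode. Exact rational arithmetic is used so that a ratio sitting exactly on a limit is not rejected by floating-point rounding.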
Table 4-2. Output Pin Status during Reset

  Pin value        Pin name
  High             READYO#, BUSY#
  Low              PEREQ, ERROR#
  Tri-state off    D15-D0

4.1.5 Processor Request (PEREQ)

When active, this pin signals to the CPU that the math coprocessor is ready for data transfer to/from its data FIFO. When all data is written to or read from the data FIFO, PEREQ is deactivated. This signal always goes inactive before BUSY# goes inactive. This signal is referenced to CPUCLK2. It should be connected to the CPU PEREQ input pin.

4.1.6 Busy Status (BUSY#)

When active, this pin signals to the CPU that the math coprocessor is currently executing an instruction. This signal is referenced to CPUCLK2. It should be connected to the CPU BUSY# input pin.

4.1.7 Error Status (ERROR#)

This pin reflects the ES bit of the status register. When active, it indicates that an unmasked exception has occurred. This signal can be changed to the inactive state only by the following instructions (without a preceding WAIT): FNINIT, FNCLEX, FNSTENV, FNSAVE, FLDCW, FLDENV, and FRSTOR. ERROR# is driven active during reset to indicate to the CPU that the math coprocessor is present. This pin is referenced to NUMCLK2 (or CPUCLK2 if CKM = 1). It should be connected to the ERROR# pin of the CPU.

4.1.8 Data Pins (D15-D0)

These bi-directional pins are used to transfer data and opcodes between the CPU and math coprocessor. They are normally connected directly to the corresponding CPU data pins. High state indicates a value of one. D0 is the least significant data bit. Timings are referenced to the rising edge of CPUCLK2.

4.1.9 Write/Read Bus Cycle (W/R#)

This signal indicates to the math coprocessor whether the CPU bus cycle in progress is a read or a write cycle. This pin should be connected directly to the CPU's W/R# pin. High indicates a write cycle to the math coprocessor; low a read cycle from the math coprocessor. This input is ignored if any of the signals STEN, NPS1#, or NPS2 are inactive. Setup and hold times are referenced to CPUCLK2.

4.1.10 Address Strobe (ADS#)

This input, in conjunction with the READY# input, indicates when the math coprocessor bus control logic may sample W/R# and the chip select signals. Setup and hold times are referenced to CPUCLK2. This pin should be connected to the ADS# pin of the CPU.
4.1.11 Bus Ready Input (READY#)

This input indicates to the math coprocessor when a CPU bus cycle is to be terminated. It is used by the bus control logic to trace bus activities. Bus cycles can be extended indefinitely until terminated by READY#. This input should be connected to the same signal that drives the CPU's READY# input. Setup and hold times are referenced to CPUCLK2.

4.1.12 Ready Output (READYO#)

This pin is activated at such a time that write cycles are terminated after two clocks (except FLDENV and FRSTOR) and read cycles after three clocks. In configurations where no extra wait states are required, this pin must directly or indirectly drive the READY# input of the CPU. Refer to the section entitled "Bus Operation" for details. This pin is activated only during bus cycles that select the math coprocessor. This signal is referenced to CPUCLK2.

(FLDENV and FRSTOR require data transfers larger than the FIFO. Therefore, PEREQ is activated for the duration of transferring 2 words of 32 bits and then deactivated until the FIFO is ready to accept two additional words. The length of the write cycles of the last operand word in each transfer, as well as of the first operand word transfer of the entire instruction, is 3 clocks instead of 2 clocks. This is done to give the Intel386 CPU enough time to sample PEREQ and to notice that the Intel387 is not ready for additional transfers.)

4.1.13 Status Enable (STEN)

This pin serves as a chip select for the math coprocessor. When inactive, this pin forces the BUSY#, PEREQ, ERROR# and READYO# outputs into a floating state. D15-D0 are normally floating and will leave the floating state only if STEN is active and additional conditions are met (read cycle). STEN also causes the chip to recognize its other chip select inputs. STEN makes it easier to do on-board testing (using the overdrive method) of other chips in systems containing the math coprocessor. STEN should be pulled up with a resistor so that it can be pulled down when testing. In boards that do not use on-board testing, STEN should be connected to VCC. Setup and hold times are relative to CPUCLK2. Note that STEN must maintain the same setup and hold times as NPS1#, NPS2, and CMD0# (i.e., if STEN changes state during a math coprocessor bus cycle, it must change state during the same CLK period as the NPS1#, NPS2, and CMD0# signals).

4.1.14 Math Coprocessor Select 1 (NPS1#)

When active (along with STEN and NPS2) in the first period of a CPU bus cycle, this signal indicates that the purpose of the bus cycle is to communicate with the math coprocessor. This pin should be connected directly to the M/IO# pin of the CPU, so that the math coprocessor is selected only when the CPU performs I/O cycles. Setup and hold times are referenced to the rising edge of CPUCLK2.

4.1.15 Math Coprocessor Select 2 (NPS2)

When active (along with STEN and NPS1#) in the first period of a CPU bus cycle, this signal indicates that the purpose of the bus cycle is to communicate with the math coprocessor. This pin should be connected directly to the A23 pin of the CPU, so that the math coprocessor is selected only when the CPU issues one of the I/O addresses reserved for the math coprocessor (8000F8H, 8000FCH, or 8000FEH, which is treated as 8000FCH by the math coprocessor). Setup and hold times are referenced to the rising edge of CPUCLK2.

4.1.16 Command (CMD0#)

During a write cycle, this signal indicates whether an opcode (CMD0# active low) or data (CMD0# inactive high) is being sent to the math coprocessor. During a read cycle, it indicates whether the control or status register (CMD0# active) or a data register (CMD0# inactive) is being read. CMD0# should be connected directly to the A2 output of the CPU. Setup and hold times are referenced to the rising edge of CPUCLK2 at the end of PH2.

4.1.17 System Power (VCC)

System power provides the +5V DC supply input. All VCC pins should be tied together on the circuit board and local decoupling capacitors should be used between VCC and VSS.

4.1.18 System Ground (VSS)

System ground provides the 0V connection from which all inputs and outputs are measured. All VSS pins should be tied together on the circuit board and local decoupling capacitors should be used between VCC and VSS.
4.2 System Configuration

The Intel387 SX math coprocessor is designed to interface with the Intel386 SX microprocessor as shown by Figure 4-1. A dedicated communication protocol makes possible high-speed transfer of opcodes and operands between the CPU and math coprocessor. The Intel387 SX math coprocessor is designed so that no additional components are required for interface with the CPU. Most control pins of the math coprocessor are connected directly to pins of the CPU.

The interface between the math coprocessor and the CPU has these characteristics:

• The math coprocessor shares the local bus of the Intel386 SX microprocessor.
• The CPU and math coprocessor share the same reset signals. They may also share the same clock input; however, for greatest performance, an external oscillator may be needed.
• The corresponding BUSY#, ERROR#, and PEREQ pins are connected together.
• The math coprocessor NPS1# and NPS2 inputs are connected to the latched CPU M/IO# and A23 outputs respectively. For math coprocessor cycles, M/IO# is always low and A23 always high.
• The math coprocessor input CMD0# is connected to the latched A2 output. The Intel386 SX microprocessor generates address 8000F8H when writing a command and address 8000FCH or 8000FEH (treated as 8000FCH by the Intel387 SX math coprocessor) when writing or reading data. It does not generate any other addresses during math coprocessor bus cycles.

Figure 4-1. Intel386 SX CPU and Intel387 SX Math Coprocessor System Configuration
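The wiring described above (NPS2 from A23, CMD0# from A2) determines how the three reserved I/O addresses appear at the math coprocessor pins. A minimal Python sketch of that mapping follows; the function names are hypothetical, and the model only looks at address bits 23 and 2, which is why 8000FEH behaves like 8000FCH.

```python
# Minimal sketch: derive the NPS2 and CMD0# pin levels from a CPU I/O address,
# following the connections NPS2 = A23 and CMD0# = A2.
def select_inputs(io_address: int) -> dict:
    """Pin levels seen by the math coprocessor for an I/O bus cycle."""
    return {
        "NPS1#": 0,                         # M/IO# is low for I/O cycles
        "NPS2": (io_address >> 23) & 1,     # A23
        "CMD0#": (io_address >> 2) & 1,     # A2
    }


def transfer_kind(io_address: int) -> str:
    """Classify a cycle as an opcode/control transfer or a data transfer."""
    pins = select_inputs(io_address)
    if pins["NPS2"] == 0:
        return "math coprocessor not selected"
    # CMD0# is active low: low selects opcode (or CW/SW), high selects data
    return "opcode or control/status" if pins["CMD0#"] == 0 else "data"


for address in (0x8000F8, 0x8000FC, 0x8000FE, 0x0000F8):
    print(hex(address), transfer_kind(address))
```

Address 8000F8H has A2 = 0 and therefore drives CMD0# active, while 8000FCH and 8000FEH both have A2 = 1 and select the data port. An ordinary I/O address such as 0000F8H leaves A23 low and does not select the device.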
4.3 Math Coprocessor Architecture

As shown in the Figure 2-1 block diagram, the Intel387 SX math coprocessor is internally divided into four sections: the bus control logic (BCL), the data interface and control logic, the floating point unit (FPU), and the power management unit (PMU). The bus control logic is responsible for CPU bus tracking and interface. The BCL is the only unit in the math coprocessor that must run synchronously with the CPU; the rest of the math coprocessor can run asynchronously with respect to the CPU. The data interface and control unit is responsible for the data flow to and from the FPU and the control registers, for receiving the instructions, decoding them, sequencing the microinstructions, and for handling some of the administrative instructions. The floating point unit (with the support of the control unit, which contains the sequencer and other support units) executes the mathematical instructions. The power manager is new to the Intel387 family. It is responsible for shutting down idle sections of the device to save power.

4.3.1 Bus Control Logic

The BCL communicates solely with the CPU using I/O bus cycles. The BCL appears to the CPU as a special peripheral device. It is special in two respects: the CPU initiates I/O automatically when it encounters ESC instructions, and the CPU uses reserved I/O addresses to communicate with the BCL. The BCL does not communicate directly with memory. The CPU performs all memory access, transferring input operands from memory to the math coprocessor and transferring outputs from the math coprocessor to memory.

4.3.2 Data Interface and Control Unit

The data interface and control unit latches the data and, subject to BCL control, directs the data to the FIFO or the instruction decoder. The instruction decoder decodes the ESC instructions sent to it by the CPU and generates controls that direct the data flow in the FIFO. It also triggers the microinstruction sequencer that controls execution of each instruction. If the ESC instruction is FINIT, FCLEX, FSTSW, FSTSW AX, FSTCW, FSETPM, or FRSTPM, the control unit executes it independently of the FPU and the sequencer. The data interface and control unit is the unit that generates the BUSY#, PEREQ, and ERROR# signals that synchronize the math coprocessor activities with the CPU.

4.3.3 Floating Point Unit

The FPU executes all instructions that involve the register stack, including arithmetic, logical, transcendental, constant, and data transfer instructions. The data path in the FPU is 84 bits wide (68 significant bits, 15 exponent bits, and a sign bit), which allows internal operand transfers to be performed at very high speeds.

4.3.4 Power Management Unit

The power management unit (PMU) controls all internal power savings circuits. When the math coprocessor is not executing an instruction, the PMU disables the internal clock to the FPU, control unit, and data interface within three clocks. The bus control logic remains enabled to accept the next instruction. Upon decode of a valid math coprocessor bus cycle, the PMU enables the internal clock to all circuits. No loss in performance occurs.

4.4 Bus Cycles

All bus cycles are initiated by the CPU. The pins STEN, NPS1#, NPS2, CMD0#, and W/R# identify bus cycles for the math coprocessor. Table 4-3 defines the types of math coprocessor bus cycles.

Table 4-3. Bus Cycle Definition

  STEN   NPS1#   NPS2   CMD0#   W/R#   Bus cycle type
  0      x       x      x       x      Math coprocessor not selected and all outputs in floating state
  1      1       x      x       x      Math coprocessor not selected
  1      x       0      x       x      Math coprocessor not selected
  1      0       1      0       0      CW or SW read from math coprocessor
  1      0       1      0       1      Opcode write to math coprocessor
  1      0       1      1       0      Data read from math coprocessor
  1      0       1      1       1      Data write to math coprocessor
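Table 4-3 is effectively a small truth table, and it can be written out as a lookup to make the priority of the select signals explicit. The Python sketch below is an illustration of the table only, not a model of the bus control logic timing; the function name `bus_cycle_type` and the argument names are hypothetical, and each argument is the logic level (0 or 1) on the corresponding pin.

```python
# Minimal sketch: classify a bus cycle from pin levels according to Table 4-3.
SELECTED_CYCLES = {
    # (CMD0#, W/R#): bus cycle type
    (0, 0): "CW or SW read from math coprocessor",
    (0, 1): "opcode write to math coprocessor",
    (1, 0): "data read from math coprocessor",
    (1, 1): "data write to math coprocessor",
}


def bus_cycle_type(sten: int, nps1_n: int, nps2: int, cmd0_n: int, w_r_n: int) -> str:
    """Return the Table 4-3 bus cycle type for the given pin levels."""
    if sten == 0:
        return "math coprocessor not selected and all outputs in floating state"
    if nps1_n == 1 or nps2 == 0:
        return "math coprocessor not selected"
    return SELECTED_CYCLES[(cmd0_n, w_r_n)]


print(bus_cycle_type(1, 0, 1, 0, 1))   # opcode write
print(bus_cycle_type(1, 0, 1, 1, 0))   # data read
print(bus_cycle_type(0, 0, 1, 0, 1))   # STEN low overrides everything
```

STEN is tested first because an inactive STEN both deselects the device and floats its outputs, regardless of the other inputs. CMD0# and W/R# matter only once STEN, NPS1# and NPS2 all indicate selection.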
intel387 tm sx math coprocessor 4.4.1 intel387 sx math coprocessor addressing the nps1 y , nps2, and cmd0 signals allow the math coprocessor to identify which bus cycles are intended for the math coprocessor. the math co- processor responds to i/o cycles when the i/o ad- dress is 8000f8h, 8000fch, and 8000feh (treated as 8000fch). the math coprocessor responds to i/o cycles when bit 23 of the i/o address is set. in other words, the math coprocessor acts as an i/o device in a reserved i/o address space. because a23 is used to select the intel387 sx math coprocessor for data transfers, it is not possible for a program running on the cpu to address the math coprocessor with an i/o instruction. only esc in- structions cause the cpu to communicate with the math coprocessor. 4.4.2 cpu/math coprocessor synchronization the pins busy y , pereq, and error y are used for various aspects of synchronization between the cpu and the math coprocessor. busy y is used to synchronize instruction transfer from the cpu to the math coprocessor. when the math coprocessor recognizes an esc instruction it asserts busy y . for most esc instructions, the cpu waits for the math coprocessor to deassert busy y before sending the new opcode. the math coprocessor uses the pereq pin of the cpu to signal that the math coprocessor is ready for data transfer to or from its data fifo. the math coprocessor does not directly access memory; rath- er, the cpu provides memory access services for the math coprocessor. (for this reason, memory ac- cess on behalf of the math coprocessor always obeys the protection rules applicable to the current cpu mode.) once the cpu initiates an math co- processor instruction that has operands, the cpu waits for pereq signals that indicate when the math coprocessor is ready for operand transfer. once all operands have been transferred (or if the instruction has no operands) the cpu continues program exe- cution while the math coprocessor executes the esc instruction. 
In 8086/8087 systems, WAIT instructions may be required to achieve synchronization of both commands and operands. In Intel386 microprocessor and Intel387 math coprocessor systems, however, WAIT instructions are required only for operand synchronization; namely, after math coprocessor stores to memory (except FSTSW and FSTCW) or loads from memory. (In 80286/80287 systems, WAIT is required before FLDENV and FRSTOR.) Used this way, WAIT ensures that the value has already been written or read by the math coprocessor before the CPU reads or changes the value.

Once it has started to execute a numerics instruction and has transferred the operands from the CPU, the math coprocessor can process the instruction in parallel with and independent of the host CPU. When the math coprocessor detects an exception, it asserts the ERROR# signal, which causes a CPU interrupt.

4.4.3 Synchronous/Asynchronous Modes

The internal logic of the math coprocessor can operate either directly from the CPU clock (synchronous mode) or from a separate clock (asynchronous mode). The two configurations are distinguished by the CKM pin. In either case, the bus control logic (BCL) of the math coprocessor is synchronized with the CPU clock. Use of asynchronous mode allows the BCL and the FPU section of the math coprocessor to run at different speeds. In this case, the ratio of the frequency of NUMCLK2 to the frequency of CPUCLK2 must lie within the range 10:16 to 14:10. Use of synchronous mode eliminates one clock generator from the board design. The internal power management unit of the Intel387 SX math coprocessor is disabled in asynchronous mode.

4.4.4 Automatic Bus Cycle Termination

In configurations where no extra wait states are required, READYO# can drive the CPU's READY# input and the math coprocessor's READY# input. If wait states are required, this pin should be connected to the logic that ORs all READY outputs from peripheral devices on the CPU bus.
READYO# is asserted by the math coprocessor only during I/O cycles that select the math coprocessor. Refer to Section 5.0, Bus Operation, for details.

5.0 Bus Operation

With respect to the bus interface, the Intel387 SX math coprocessor is fully synchronous with the CPU. Both operate at the same rate because each generates its internal CLK signal by dividing CPUCLK2 by two. Furthermore, both internal CLK signals are in phase, because they are synchronized by the same RESETIN signal.

A bus cycle for the math coprocessor starts when the CPU activates ADS# and drives new values on the address and cycle definition lines (W/R#, M/IO#, etc.). The math coprocessor examines the address and cycle definition lines in the same CLK period during which ADS# is activated. This CLK period is considered the first CLK of the bus cycle.
During this first CLK period, the math coprocessor also examines the W/R# input signal to determine whether the cycle is a read or a write cycle and examines the CMD0# input to determine whether an opcode, operand, or control/status register transfer is to occur.

The Intel387 SX math coprocessor supports both pipelined (i.e., overlapped) and non-pipelined bus cycles. A non-pipelined cycle is one for which the CPU asserts ADS# when no other bus cycle is in progress. A pipelined bus cycle is one for which the CPU asserts ADS# and provides valid next address and control signals before the prior math coprocessor cycle terminates. The CPU may do this as early as the second CLK period after asserting ADS# for the prior cycle. Pipelining increases the availability of the bus by at least one CLK period. The Intel387 SX math coprocessor supports pipelined bus cycles in order to optimize address pipelining by the CPU for memory cycles.

Bus operation is described in terms of an abstract state machine. Figure 5-1 illustrates the states and state transitions for math coprocessor bus cycles:

- T_I is the idle state. This is the state of the bus logic after RESET, the state to which bus logic returns after every non-pipelined bus cycle, and the state to which bus logic returns after a series of pipelined cycles.
- T_RS is the READY#-sensitive state. Different types of bus cycles may require a minimum of one or two successive T_RS states. The bus logic remains in T_RS state until READY# is sensed, at which point the bus cycle terminates. Any number of wait states may be implemented by delaying READY#, thereby causing additional successive T_RS states.
- T_P is the first state for every pipelined bus cycle. This state is not used by non-pipelined cycles.

Note that the bus logic tracks bus state regardless of the values on the chip/port select pins.

Figure 5-1. Bus State Diagram

The READYO# output of the math coprocessor indicates when a math coprocessor bus cycle may be terminated if no extra wait states are required. For all write cycles (except those for the instructions FLDENV and FRSTOR), READYO# is always asserted during the first T_RS state, regardless of the number of wait states. For all read cycles (and write cycles for FLDENV and FRSTOR), READYO# is always asserted in the second T_RS state, regardless of the number of wait states. These rules apply to both pipelined and non-pipelined cycles. System designers may use READYO# in one of the following ways:

1. Connect it (directly or through logic that ORs READY# signals from other devices) to the READY# inputs of the CPU and math coprocessor.
2. Use it as one input to a wait-state generator.

The following sections illustrate different types of Intel387 SX math coprocessor bus cycles. Because different instructions have different amounts of overhead before, between, and after operand transfer cycles, it is not possible to represent in a few diagrams all of the combinations of successive operand transfer cycles. The following bus cycle diagrams show memory cycles between math coprocessor operand transfer cycles. Note, however, that during FRSTOR some consecutive accesses to the math coprocessor do not have intervening memory accesses. For the timing relationship between operand transfer cycles and opcode write or other overhead activities, see Figure 7-7, "Other Parameters".

5.1 Non-Pipelined Bus Cycles

Figure 5-2 illustrates bus activity for consecutive non-pipelined bus cycles.

At the second clock of the bus cycle, the math coprocessor enters the T_RS state. During this state, it samples the READY# input and stays in this state as long as READY# is inactive.
5.1.1 Write Cycle

In write cycles, the math coprocessor drives the READYO# signal for one CLK period during the second CLK period of the cycle (i.e., the first T_RS state); therefore, the fastest write cycle takes two CLK periods (see cycle 2 of Figure 5-2). For the instructions FLDENV and FRSTOR, however, the math coprocessor forces a wait state by delaying the activation of READYO# to the second T_RS state (not shown in Figure 5-2).

The math coprocessor samples the D15-D0 inputs into data latches at the falling edge of CLK as long as it stays in T_RS state.
Figure 5-2. Non-Pipelined Read and Write Cycles

Cycles 1 & 2 represent part of the operand transfer cycle for instructions involving either 4-byte or 8-byte operand loads. Cycles 3 & 4 represent part of the operand transfer cycle for a store operation.
* Cycles 1 & 2 could repeat here, or T_I states could occur for various non-operand transfer cycles and overhead.

When READY# is asserted, the math coprocessor returns to the idle state. Simultaneously with the math coprocessor entering the idle state, the CPU may assert ADS# again, signaling the beginning of yet another cycle.

5.1.2 Read Cycle

At the rising edge of CLK in the second CLK period of the cycle (i.e., the first T_RS state), the math coprocessor starts to drive the D15-D0 outputs and continues to drive them as long as it stays in T_RS state.

At least one wait state must be inserted to ensure that the CPU latches the correct data. Because the math coprocessor starts driving the data bus only at the rising edge of CLK in the second clock period of the bus cycle, not enough time is left for the data signals to propagate and be latched by the CPU before the next falling edge of CLK. Therefore, the math coprocessor does not drive the READYO# signal until the third CLK period of the cycle. Thus, if the READYO# output drives the CPU's READY# input, one wait state is automatically inserted.

Because one wait state is required for math coprocessor reads, the minimum length of a math coprocessor read cycle is three CLK periods, as cycle 3 of Figure 5-2 shows.

When READY# is asserted, the math coprocessor returns to the idle state. Simultaneously with the math coprocessor's entering the idle state, the CPU may assert ADS# again, signaling the beginning of yet another cycle. The transition from T_RS state to idle state causes the math coprocessor to put the D15-D0 outputs into the floating state, allowing another device to drive the data bus.
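The minimum cycle lengths described in Sections 5.1.1 and 5.1.2 can be summarized in a few lines of arithmetic. The sketch below is a hedged illustration, assuming READYO# drives READY# directly; the function names `min_cycle_clks` and `cycle_time_ns` and the `extra_wait_states` parameter are hypothetical and chosen only for this example. The conversion to nanoseconds relies on the statement in Section 5.0 that the internal CLK is CPUCLK2 divided by two.

```python
def min_cycle_clks(kind: str, fldenv_or_frstor: bool = False,
                   extra_wait_states: int = 0) -> int:
    """Length in CLK periods of a non-pipelined math coprocessor bus cycle.

    kind is "read" or "write". extra_wait_states counts additional T_RS
    states added by external logic beyond what READYO# itself imposes.
    """
    if kind not in ("read", "write"):
        raise ValueError("kind must be 'read' or 'write'")
    if extra_wait_states < 0:
        raise ValueError("extra_wait_states must be non-negative")
    if kind == "write" and not fldenv_or_frstor:
        base = 2  # first CLK + one T_RS state (READYO# in first T_RS)
    else:
        base = 3  # first CLK + two T_RS states (READYO# in second T_RS)
    return base + extra_wait_states


def cycle_time_ns(clks: int, cpuclk2_period_ns: float) -> float:
    """Convert CLK periods to nanoseconds; one CLK is two CPUCLK2 periods."""
    return clks * 2 * cpuclk2_period_ns


if __name__ == "__main__":
    for kind in ("write", "read"):
        clks = min_cycle_clks(kind)
        print(kind, clks, "CLK =", cycle_time_ns(clks, 15.0), "ns at t1 = 15 ns")
```

With the 15 ns minimum CPUCLK2 period listed for the 33 MHz part, the fastest write cycle works out to 60 ns and the fastest read cycle to 90 ns under these assumptions.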
5.2 Pipelined Bus Cycles

Because all the activities of the math coprocessor bus interface occur either during the T_RS state or during the transitions to or from that state, the only difference between a pipelined and a non-pipelined cycle is the manner of changing from one state to another. The exact activities during each state are detailed in the previous section, "Non-Pipelined Bus Cycles".

When the CPU asserts ADS# before the end of a bus cycle, both ADS# and READY# are active during a T_RS state. This condition causes the math coprocessor to change to a different state named T_P. One clock period after a T_P state, the math coprocessor always returns to the T_RS state. In consecutive pipelined cycles, the math coprocessor bus logic uses only the T_RS and T_P states.

Figure 5-3 shows the fastest transitions into and out of the pipelined bus cycles. Cycle 1 in the figure represents a non-pipelined cycle. (Non-pipelined write cycles are always followed by another non-pipelined cycle, because READY# is asserted before the earliest possible assertion of ADS# for the next cycle.)

Figure 5-4 shows pipelined write and read cycles with one additional T_RS state beyond the minimum required. Delaying the assertion of READY# requires external logic.

5.3 Mixed Bus Cycles

When the math coprocessor bus logic is in the T_RS state, it distinguishes between non-pipelined and pipelined cycles according to the behavior of ADS# and READY#. In a non-pipelined cycle, only READY# is activated, and the transition is from the T_RS state to the idle state. In a pipelined cycle, both READY# and ADS# are active, and the transition is first from T_RS state to T_P state, then, after one clock period, back to T_RS state.

Figure 5-3. Fastest Transitions to and from Pipelined Cycles

Cycles 1-4 represent the operand transfer cycle for an instruction involving a transfer of two 32-bit loads in total. The opcode write cycles and other overhead are not shown. Note that the next cycle will be a pipelined cycle if both READY# and ADS# are sampled active at the end of a T_RS state of the current cycle.
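The state transitions described in Sections 5.0 through 5.3 can be modeled as a three-state machine. The following Python sketch is a simplified illustration rather than a description of the actual bus logic: the names `BusState`, `next_state`, and `trace` are hypothetical, each CLK is reduced to a pair of booleans saying whether ADS# and READY# were sampled active, and the idle-to-T_RS transition on ADS# is inferred from the statement that T_RS is entered at the second clock of a bus cycle.

```python
from enum import Enum


class BusState(Enum):
    T_I = "idle"
    T_RS = "ready-sensitive"
    T_P = "pipelined"


def next_state(state: BusState, ads: bool, ready: bool) -> BusState:
    """Return the bus state for the next CLK.

    ads and ready are True when ADS# / READY# are sampled active (low)
    during the current CLK.
    """
    if state is BusState.T_I:
        # A cycle starts when ADS# is activated; T_RS follows on the next CLK.
        return BusState.T_RS if ads else BusState.T_I
    if state is BusState.T_P:
        # One clock after T_P the logic always returns to T_RS.
        return BusState.T_RS
    # state is T_RS: stay until READY# is sensed.
    if not ready:
        return BusState.T_RS
    # READY# with ADS# means the next cycle is pipelined; otherwise go idle.
    return BusState.T_P if ads else BusState.T_I


def trace(samples, start=BusState.T_I):
    """Apply next_state over a sequence of (ads, ready) samples."""
    states = [start]
    for ads, ready in samples:
        states.append(next_state(states[-1], ads, ready))
    return states


if __name__ == "__main__":
    # Non-pipelined read: ADS#, one wait state, then READY#.
    print([s.name for s in trace([(True, False), (False, False), (False, True)])])
```

Feeding the model a write cycle (ADS#, then READY# one CLK later) returns to idle after a single T_RS state, while asserting ADS# together with READY# routes the machine through T_P and back to T_RS, matching the pipelined behavior described above.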
Figure 5-4. Pipelined Cycles with Wait States

Note:
1. Cycles between operand write to the math coprocessor and storing result.
5.4 BUSY# and PEREQ Timing Relationship

Figure 5-5 shows the activation of BUSY# at the beginning of instruction execution and its deactivation upon completion of the instruction. PEREQ is activated within this interval. If ERROR# is ever asserted, it would be asserted at least six CPUCLK2 periods after the deactivation of PEREQ and would be deasserted at least six CPUCLK2 periods before the deactivation of BUSY#.

Figure 5-5. STEN, BUSY#, and PEREQ Timing Relationships

Notes:
1. Instruction dependent.
2. PEREQ is an asynchronous input to the Intel386(TM) microprocessor; it may not be asserted (instruction dependent).
3. More operand transfers.
4. Memory read (operand) cycle is not shown.
6.0 Package Specifications

6.1 Mechanical Specifications

The Intel387 SX math coprocessor is packaged in a 68-pin PLCC package. Detailed mechanical specifications can be found in the Intel Packaging Specification, order number 231369.

6.2 Thermal Specifications

The Intel387 SX math coprocessor is specified for operation when the case temperature is within the range of 0°C to 100°C. The case temperature (T_C) may be measured in any environment to determine whether the Intel387 SX math coprocessor is within the specified operating range. The case temperature should be measured at the center of the top surface.

The ambient temperature (T_A) is guaranteed as long as T_C is not violated. The ambient temperature can be calculated from θ_JC (thermal resistance from the transistor junction to the case) and θ_JA (thermal resistance from junction to ambient) using the following relations:

Junction temperature: $T_J = T_C + P \cdot \theta_{JC}$
Ambient temperature: $T_A = T_J - P \cdot \theta_{JA}$
Case temperature: $T_C = T_A + P \cdot (\theta_{JA} - \theta_{JC})$

Values for θ_JA and θ_JC are given in Table 6-1 for the 68-pin PLCC package. θ_JA is given at various airflows. Table 6-2 shows the maximum T_A allowable without exceeding T_C at various airflows. Note that T_A can be improved further by attaching a heat sink to the package. P is calculated by using the maximum hot I_CC and maximum V_CC.

Table 6-1. Thermal Resistances (°C/Watt) θ_JC and θ_JA

| Package | θ_JC | θ_JA at 0 ft/min (0 m/sec) | 200 (1.01) | 400 (2.03) | 600 (3.04) | 800 (4.06) | 1000 (5.07) |
|---------|------|------|------|------|------|------|------|
| 68-pin PLCC | 8 | 30 | 25 | 20 | 15.5 | 13 | 12 |

Table 6-2. Maximum T_A at Various Airflows

| Package | T_A (°C) at 0 ft/min (0 m/sec) | 200 (1.01) | 400 (2.03) | 600 (3.04) | 800 (4.06) | 1000 (5.07) |
|---------|------|------|------|------|------|------|
| 68-pin PLCC | 84.9 | 88.3 | 91.8 | 94.8 | 96.6 | 97.2 |

Maximum T_A is calculated at maximum V_CC and maximum I_CC.

7.0 Electrical Characteristics

The following specifications represent the targets of the design effort. They are subject to change without notice. Contact your Intel representative to get the most up-to-date values.

7.1 Absolute Maximum Ratings*

Case temperature T_C under bias: 0°C to +100°C
Storage temperature: -65°C to +150°C
Voltage on any pin with respect to ground: -0.5 V to V_CC + 0.5 V
Power dissipation: 0.8 W

NOTICE: This is a production data sheet. The specifications are subject to change without notice.

*WARNING: Stressing the device beyond the "Absolute Maximum Ratings" may cause permanent damage. These are stress ratings only. Operation beyond the "Operating Conditions" is not recommended, and extended exposure beyond the "Operating Conditions" may affect device reliability.
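The thermal relations of Section 6.2 can be checked numerically against the package data. The Python sketch below rearranges the case-temperature relation to give the maximum ambient temperature for a 100°C case limit. The function name `max_ambient_c` is hypothetical, and the power figure is an assumption: the data sheet says only that P uses maximum hot I_CC and maximum V_CC, so the example takes 125 mA (one of the listed I_CC maxima) at 5.5 V (5 V + 10%), a choice that happens to reproduce the Table 6-2 values to within rounding.

```python
THETA_JC = 8.0  # °C/W, junction to case (Table 6-1)

# °C/W, junction to ambient, keyed by airflow in ft/min (Table 6-1)
THETA_JA = {0: 30.0, 200: 25.0, 400: 20.0, 600: 15.5, 800: 13.0, 1000: 12.0}

T_CASE_MAX_C = 100.0

# Assumed power: 125 mA x 5.5 V. This pairing is an illustration only.
ASSUMED_POWER_W = 0.125 * 5.5


def max_ambient_c(t_case_max_c: float, power_w: float,
                  theta_ja: float, theta_jc: float) -> float:
    """Solve T_C = T_A + P * (theta_JA - theta_JC) for T_A."""
    return t_case_max_c - power_w * (theta_ja - theta_jc)


if __name__ == "__main__":
    for airflow, theta_ja in THETA_JA.items():
        t_a = max_ambient_c(T_CASE_MAX_C, ASSUMED_POWER_W, theta_ja, THETA_JC)
        print(f"{airflow:5d} ft/min: max T_A = {t_a:.2f} °C")
```

Because the result depends linearly on P, substituting a different supply current (for example one of the other I_CC maxima in the D.C. specifications) shifts every entry by the same proportion of (θ_JA - θ_JC).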
7.2 D.C. Characteristics

Table 7-1. D.C. Specifications
T_C = 0°C to +100°C, V_CC = 5V ± 10%

| Symbol | Parameter | Min | Max | Units | Test Conditions |
|--------|-----------|-----|-----|-------|-----------------|
| V_IL | Input LO voltage | -0.3 | +0.8 | V | (Note 1) |
| V_IH | Input HI voltage | 2.0 | V_CC + 0.3 | V | (Note 1) |
| V_CL | CPUCLK2 and NUMCLK2 input LO voltage | -0.3 | +0.8 | V | |
| V_CH | CPUCLK2 and NUMCLK2 input HI voltage | V_CC - 0.8 | V_CC + 0.8 | V | |
| V_OL | Output LO voltage | | 0.45 | V | (Note 2) |
| V_OH | Output HI voltage | 2.4 | | V | (Note 3) |
| V_OH | Output HI voltage | V_CC - 0.8 | | V | (Note 4) |
| I_CC | Power supply current, dynamic mode, freq. = 33 MHz (Note 5) | | 150 | mA | I_CC typ. = 135 mA |
| I_CC | Dynamic mode, freq. = 25 MHz (Note 5) | | 150 | mA | I_CC typ. = 130 mA |
| I_CC | Dynamic mode, freq. = 20 MHz (Note 5) | | 125 | mA | I_CC typ. = 110 mA |
| I_CC | Dynamic mode, freq. = 16 MHz (Note 5) | | 100 | mA | I_CC typ. = 90 mA |
| I_CC | Dynamic mode, freq. = 1 MHz (Note 5) | | 20 | mA | I_CC typ. = 5 mA |
| I_CC | Idle mode (Note 6) | | 7 | mA | I_CC typ. = 4 mA |
| I_LI | Input leakage current | | ±15 | µA | 0V ≤ V_IN ≤ V_CC |
| I_LO | I/O leakage current | | ±15 | µA | 0.45V ≤ V_O ≤ V_CC |
| C_IN | Input capacitance | 7 | 10 | pF | f_C = 1 MHz |
| C_O | I/O capacitance | 7 | 12 | pF | f_C = 1 MHz |
| C_CLK | Clock capacitance | 7 | 20 | pF | f_C = 1 MHz |

Notes:
1. This parameter is for all inputs, excluding the clock inputs.
2. This parameter is measured at I_OL as follows: Data = 4.0 mA; READYO#, ERROR#, BUSY#, PEREQ = 25 mA.
3. This parameter is measured at I_OH as follows: Data = 1.0 mA; READYO#, ERROR#, BUSY#, PEREQ = 0.6 mA.
4. This parameter is measured at I_OH as follows: Data = 0.2 mA; READYO#, ERROR#, BUSY#, PEREQ = 0.12 mA.
5. Synchronous clock mode (CKM = 1). I_CC is measured at steady state, maximum capacitive loading on the outputs, and worst-case D.C. level at the inputs.
6. Intel387 SX math coprocessor internal idle mode. Synchronous clock mode, clock and control inputs are active but the math coprocessor is not executing an instruction. Outputs driving CMOS inputs.
7.3 A.C. Characteristics

Table 7-2a. Timing Requirements of the Bus Interface Unit
T_C = 0°C to +100°C, V_CC = 5V ± 10% (all measurements made at 1.5V unless otherwise specified)

| Pin | Symbol | Parameter | 16 MHz-25 MHz Min (ns) | 16 MHz-25 MHz Max (ns) | 33 MHz Min (ns) | 33 MHz Max (ns) | Test Conditions | Refer to Figure |
|-----|--------|-----------|------|------|------|------|------|------|
| CPUCLK2 | t1 | Period | 20 | DC | 15 | DC | 2.0V | 7-2 |
| CPUCLK2 | t2a | High time | 6 | | 6.25 | | 2.0V | |
| CPUCLK2 | t2b | High time | 3 | | 4.5 | | V_CC - 0.8V | |
| CPUCLK2 | t3a | Low time | 6 | | 6.25 | | 2.0V | |
| CPUCLK2 | t3b | Low time | 4 | | 4.5 | | 0.8V | |
| CPUCLK2 | t4 | Fall time | | 7 | | 4 | From V_CC - 0.8V to 0.8V | |
| CPUCLK2 | t5 | Rise time | | 7 | | 4 | From 0.8V to V_CC - 0.8V | |
| READYO# | t7a | Out delay | 4 | 25 | 4 | 17 | C_L = 50 pF | 7-3 |
| PEREQ | t7b | Out delay | 4 | 23 | 4 | 21 | C_L = 50 pF | |
| BUSY# | t7c | Out delay | 4 | 23 | 4 | 21 | C_L = 50 pF | |
| ERROR# | t7d | Out delay | 4 | 23 | 4 | 23 | C_L = 50 pF | |
| D15-D0 | t8 | Out delay | 1 | 45 | 0 | 37 | C_L = 50 pF | 7-4 |
| D15-D0 | t10 | Setup time | 11 | | 8 | | | |
| D15-D0 | t11 | Hold time | 11 | | 8 | | | |
| D15-D0 | t12* | Float time | 6 | 24 | 6 | 19 | | |
| READYO# | t13a* | Float time | 1 | 40 | 1 | 30 | | 7-6 |
| PEREQ | t13b* | Float time | 1 | 40 | 1 | 30 | | |
| BUSY# | t13c* | Float time | 1 | 40 | 1 | 30 | | |
| ERROR# | t13d* | Float time | 1 | 40 | 1 | 30 | | |
| ADS# | t14a | Setup time | 15 | | 13 | | | 7-4 |
| ADS# | t15a | Hold time | 4 | | 4 | | | |
| W/R# | t14b | Setup time | 15 | | 13 | | | |
| W/R# | t15b | Hold time | 4 | | 4 | | | |
| READY# | t16a | Setup time | 9 | | 7 | | | 7-4 |
| READY# | t17a | Hold time | 4 | | 4 | | | |
| CMD0# | t16b | Setup time | 16 | | 13 | | | |
| CMD0# | t17b | Hold time | 2 | | 2 | | | |
| NPS1#, NPS2 | t16c | Setup time | 16 | | 13 | | | |
| NPS1#, NPS2 | t17c | Hold time | 2 | | 2 | | | |
| STEN | t16d | Setup time | 15 | | 13 | | | |
| STEN | t17d | Hold time | 2 | | 2 | | | |
| RESETIN | t18 | Setup time | 8 | | 5 | | | 7-5 |
| RESETIN | t19 | Hold time | 3 | | 2 | | | |

Note:
* Float condition occurs when maximum output current becomes less than I_LO in magnitude. Float delay is not tested.
Table 7-2b. Timing Requirements of the Execution Unit (Asynchronous Mode, CKM = 0)

| Pin | Symbol | Parameter | 16 MHz-25 MHz Min (ns) | 16 MHz-25 MHz Max (ns) | 33 MHz Min (ns) | 33 MHz Max (ns) | Test Conditions | Refer to Figure |
|-----|--------|-----------|------|------|------|------|------|------|
| NUMCLK2 | t1 | Period | 20 | 500 | 15 | 500 | 2.0V | 7-2 |
| NUMCLK2 | t2a | High time | 6 | | 6.25 | | 2.0V | |
| NUMCLK2 | t2b | High time | 3 | | 4.5 | | V_CC - 0.8V | |
| NUMCLK2 | t3a | Low time | 6 | | 6.25 | | 2.0V | |
| NUMCLK2 | t3b | Low time | 4 | | 4.5 | | 0.8V | |
| NUMCLK2 | t4 | Fall time | | 7 | | 6 | From V_CC - 0.8V to 0.8V | |
| NUMCLK2 | t5 | Rise time | | 7 | | 6 | From 0.8V to V_CC - 0.8V | |
| NUMCLK2/CPUCLK2 | | Ratio | 10/16 | 14/10 | 10/16 | 14/10 | | |

Note: If not used (CKM = 1), tie NUMCLK2 low.

Table 7-2c. Other A.C. Parameters

| Pin | Symbol | Parameter | Min | Max | Units |
|-----|--------|-----------|-----|-----|-------|
| RESETIN | t30 | Duration | 40 | | NUMCLK2 |
| RESETIN | t31 | RESETIN inactive to 1st opcode write | 50 | | NUMCLK2 |
| BUSY# | t32 | Duration | 6 | | CPUCLK2 |
| BUSY#, ERROR# | t33 | ERROR# (in)active to BUSY# inactive | 6 | | CPUCLK2 |
| PEREQ, ERROR# | t34 | PEREQ inactive to ERROR# active | 6 | | CPUCLK2 |
| READY#, BUSY# | t35 | READY# active to BUSY# active | 0 | 4 | CPUCLK2 |
| READY# | t36 | Minimum time from opcode write to opcode/operand write | 4 | | CPUCLK2 |
| READY# | t37 | Minimum time from operand write to operand write | 4 | | CPUCLK2 |
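The NUMCLK2/CPUCLK2 ratio limit in Table 7-2b lends itself to a simple design-time check. The sketch below is a hedged example: the function name `async_ratio_ok` is hypothetical, and treating both end points of the 10/16 to 14/10 range as allowed is an assumption, since the data sheet says only that the ratio "must lie within the range". Exact rational arithmetic is used so that boundary cases are not disturbed by floating-point rounding.

```python
from fractions import Fraction

RATIO_MIN = Fraction(10, 16)  # lower NUMCLK2/CPUCLK2 limit (Table 7-2b)
RATIO_MAX = Fraction(14, 10)  # upper NUMCLK2/CPUCLK2 limit (Table 7-2b)


def async_ratio_ok(numclk2_hz: int, cpuclk2_hz: int) -> bool:
    """Return True if the clock pair satisfies the asynchronous-mode ratio.

    Frequencies are given in hertz as integers. End points are treated as
    valid, which is an assumption of this sketch.
    """
    if numclk2_hz <= 0 or cpuclk2_hz <= 0:
        raise ValueError("frequencies must be positive")
    ratio = Fraction(numclk2_hz, cpuclk2_hz)
    return RATIO_MIN <= ratio <= RATIO_MAX


if __name__ == "__main__":
    print(async_ratio_ok(40_000_000, 50_000_000))  # ratio 0.8
    print(async_ratio_ok(25_000_000, 50_000_000))  # ratio 0.5
```

A complete check would also confirm that each clock individually meets the period limits in Tables 7-2a and 7-2b; the ratio test covers only the relationship between the two clocks.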
Figure 7-1a. Typical Output Valid Delay vs Load Capacitance at Max Operating Temperature
Note: * Typical part under worst-case conditions.

Figure 7-1b. Typical Output Slew Time vs Load Capacitance at Max Operating Temperature
Note: * Typical part under worst-case conditions.

Figure 7-1c. Maximum I_CC vs Frequency

Figure 7-2. CPUCLK2/NUMCLK2 Waveform and Measurement Points for Input/Output

Figure 7-3. Output Signals

Figure 7-4. Input and I/O Signals

Figure 7-5. RESET Signal
Note: The second internal processor phase following RESET high to low transition is PH2.

Figure 7-6. Float from STEN

Figure 7-7. Other Parameters
* In NUMCLK2's
** Or last operand
Note:
1. Memory read (operand) cycle is not shown.
8.0 Intel387 SX Math Coprocessor Instruction Set

Instructions for the Intel387 SX math coprocessor assume one of the five forms shown in Table 8-1. In all cases, instructions are at least two bytes long and begin with the bit pattern 11011B, which identifies the ESCAPE class of instruction. Instructions that refer to memory operands specify addresses using the CPU's addressing modes.

MOD (mode field) and R/M (register/memory specifier) have the same interpretation as the corresponding fields of CPU instructions (refer to the Programmer's Reference Manual for the CPU). The SIB (scale index base) byte and DISP (displacement) are optionally present in instructions that have MOD and R/M fields. Their presence depends on the values of MOD and R/M, as for instructions of the CPU.

The instruction summaries that follow in Table 8-2 assume that the instruction has been prefetched, decoded, and is ready for execution; that bus cycles do not require wait states; that there are no local bus HOLD requests delaying processor access to the bus; and that no exceptions are detected during instruction execution. If the instruction has MOD and R/M fields that call for both base and index registers, add one clock.

Table 8-1. Instruction Formats

| Format | First Byte (bits 15-8) | Second Byte (bits 7-0) | Optional Fields |
|--------|------------------------|------------------------|-----------------|
| 1 | 11011 OPA 1 | MOD 1 OPB R/M | SIB, DISP |
| 2 | 11011 MF OPA | MOD OPB* R/M | SIB, DISP |
| 3 | 11011 d P OPA | 1 1 OPB* ST(i) | |
| 4 | 11011 0 0 1 | 1 1 1 OP | |
| 5 | 11011 0 1 1 | 1 1 1 OP | |

Bit positions: 15-11 | 10 9 8 | 7 6 | 5 | 4 3 2 1 0

OP = Instruction opcode, possibly split into two fields OPA and OPB
MF = Memory format
  00 - 32-bit real
  01 - 32-bit integer
  10 - 64-bit real
  11 - 16-bit integer
d = Destination
  0 - Destination is ST(0)
  1 - Destination is ST(i)
R XOR d = 0 - Destination (OP) Source
R XOR d = 1 - Source (OP) Destination
* In FSUB and FDIV, the low-order bit of OPB is the R (reversed) bit
P = Pop
  0 - Do not pop stack
  1 - Pop stack after operation
ESC = 11011
ST(i) = Register stack element i
  000 = Stack top
  001 = Second stack element
  ...
  111 = Eighth stack element
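The field layout in Table 8-1 can be illustrated by splitting the first two instruction bytes into their bit fields. The Python sketch below is illustrative only: the function name `decode_esc` and the dictionary keys are hypothetical, and the `mf` entry is meaningful only for format 2 instructions (the sketch does not try to tell the five formats apart beyond reporting whether MOD selects a register form).

```python
ESC_PREFIX = 0b11011  # top five bits of every math coprocessor instruction

MEMORY_FORMATS = {
    0b00: "32-bit real",
    0b01: "32-bit integer",
    0b10: "64-bit real",
    0b11: "16-bit integer",
}


def decode_esc(byte0: int, byte1: int) -> dict:
    """Split the first two bytes of an ESC instruction into bit fields."""
    if not (0 <= byte0 <= 0xFF and 0 <= byte1 <= 0xFF):
        raise ValueError("bytes must be in the range 0..255")
    if byte0 >> 3 != ESC_PREFIX:
        raise ValueError("first byte does not start with 11011")
    mod = byte1 >> 6
    return {
        "low3": byte0 & 0b111,            # bits 10-8 of the instruction
        "mf": MEMORY_FORMATS[(byte0 >> 1) & 0b11],  # format 2 only
        "mod": mod,                       # bits 7-6
        "opb": (byte1 >> 3) & 0b111,      # bits 5-3
        "rm": byte1 & 0b111,              # bits 2-0: R/M or ST(i)
        "register_form": mod == 0b11,     # MOD = 11 means no memory operand
        "op5": byte1 & 0b11111,           # bits 4-0, the OP field of formats 4 and 5
    }


if __name__ == "__main__":
    # D9 C1 matches the pattern "ESC 001, 11000 ST(i)" with i = 1.
    print(decode_esc(0xD9, 0xC1))
```

For example, the byte pair D8 06 decodes with MF = 00 (32-bit real), MOD = 00, and OPB = 000, which corresponds to the pattern "ESC MF 0, MOD 000 R/M" in the instruction summary.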
Where four clock count ranges are given for an instruction, they apply to 32-bit real / 32-bit integer / 64-bit real / 16-bit integer operands, in that order.

| Instruction | Byte 0 | Byte 1 | Optional Bytes 2-6 | Clock Count Range |
|-------------|--------|--------|--------------------|-------------------|
| DATA TRANSFER | | | | |
| FLD = Load (a) | | | | |
| Integer/real memory to ST(0) | ESC MF 1 | MOD 000 R/M | SIB/DISP | 11-20 / 28-44 / 20-27 / 42-53 |
| Long integer memory to ST(0) | ESC 111 | MOD 101 R/M | SIB/DISP | 30-58 |
| Extended real memory to ST(0) | ESC 011 | MOD 101 R/M | SIB/DISP | 16-47 |
| BCD memory to ST(0) | ESC 111 | MOD 100 R/M | SIB/DISP | 49-101 |
| ST(i) to ST(0) | ESC 001 | 11000 ST(i) | | 7-12 |
| FST = Store | | | | |
| ST(0) to integer/real memory | ESC MF 1 | MOD 010 R/M | SIB/DISP | 27-45 / 59-78 / 59 / 58-76 |
| ST(0) to ST(i) | ESC 101 | 11010 ST(i) | | 7-11 |
| FSTP = Store and pop | | | | |
| ST(0) to integer/real memory | ESC MF 1 | MOD 011 R/M | SIB/DISP | 27-45 / 59-78 / 59 / 58-76 |
| ST(0) to long integer memory | ESC 111 | MOD 111 R/M | SIB/DISP | 64-86 |
| ST(0) to extended real memory | ESC 011 | MOD 111 R/M | SIB/DISP | 50-56 |
| ST(0) to BCD memory | ESC 111 | MOD 110 R/M | SIB/DISP | 116-194 |
| ST(0) to ST(i) | ESC 101 | 11011 ST(i) | | 7-11 |
| FXCH = Exchange ST(i) and ST(0) | ESC 001 | 11001 ST(i) | | 10-17 |
| COMPARISON | | | | |
| FCOM = Compare | | | | |
| Integer/real memory to ST(0) | ESC MF 0 | MOD 010 R/M | SIB/DISP | 15-27 / 36-54 / 18-31 / 39-62 |
| ST(i) to ST(0) | ESC 000 | 11010 ST(i) | | 13-21 |
| FCOMP = Compare and pop | | | | |
| Integer/real memory to ST(0) | ESC MF 0 | MOD 011 R/M | SIB/DISP | 15-27 / 36-54 / 18-31 / 39-62 |
| ST(i) to ST(0) | ESC 000 | 11011 ST(i) | | 13-21 |
| FCOMPP = Compare and pop twice, ST(1) to ST(0) | ESC 110 | 1101 1001 | | 13-21 |
| FTST = Test ST(0) | ESC 001 | 1110 0100 | | 17-25 |
| FUCOM = Unordered compare | ESC 101 | 11100 ST(i) | | 13-21 |
| FUCOMP = Unordered compare and pop | ESC 101 | 11101 ST(i) | | 13-21 |
| FUCOMPP = Unordered compare and pop twice | ESC 010 | 1110 1001 | | 13-21 |
| FXAM = Examine ST(0) | ESC 001 | 1110 0101 | | 24-37 |

Shaded areas indicate instructions not available in 8087/80287.

Note:
a. When loading single or double precision zero from memory, add 5 clocks.
| Instruction | Byte 0 | Byte 1 | Optional Bytes 2-6 | Clock Count Range |
|-------------|--------|--------|--------------------|-------------------|
| ARITHMETIC | | | | |
| FADD = Add | | | | |
| Integer/real memory to ST(0) | ESC MF 0 | MOD 000 R/M | SIB/DISP | 14-31 / 36-58 / 19-38 / 38-64 |
| ST(i) and ST(0) | ESC d P 0 | 11000 ST(i) | | 12-26 (b) |
| FSUB = Subtract | | | | |
| Integer/real memory with ST(0) | ESC MF 0 | MOD 10 R R/M | SIB/DISP | 14-31 / 36-58 / 19-38 / 38-64 (c) |
| ST(i) to ST(0) | ESC d P 0 | 1110 R R/M | | 12-26 (d) |
| FMUL = Multiply | | | | |
| Integer/real memory with ST(0) | ESC MF 0 | MOD 001 R/M | SIB/DISP | 21-33 / 45-73 / 27-57 / 46-74 |
| ST(i) and ST(0) | ESC d P 0 | 1100 1 R/M | | 17-50 (e) |
| FDIV = Divide | | | | |
| Integer/real memory with ST(0) | ESC MF 0 | MOD 11 R R/M | SIB/DISP | 79-87 / 103-116 (f) / 85-95 / 105-124 (g) |
| ST(i) and ST(0) | ESC d P 0 | 1111 R R/M | | 77-80 (h) |
| FSQRT (i) = Square root | ESC 001 | 1111 1010 | | 97-111 |
| FSCALE = Scale ST(0) by ST(1) | ESC 001 | 1111 1101 | | 44-82 |
| FPREM = Partial remainder | ESC 001 | 1111 1000 | | 56-140 |
| FPREM1 = Partial remainder (IEEE) | ESC 001 | 1111 0101 | | 81-168 |
| FRNDINT = Round ST(0) to integer | ESC 001 | 1111 1100 | | 41-62 |
| FXTRACT = Extract components of ST(0) | ESC 001 | 1111 0100 | | 42-63 |
| FABS = Absolute value of ST(0) | ESC 001 | 1110 0001 | | 14-21 |
| FCHS = Change sign of ST(0) | ESC 001 | 1110 0000 | | 17-24 |
| TRANSCENDENTAL | | | | |
| FCOS (k) = Cosine of ST(0) | ESC 001 | 1111 1111 | | 122-680 |
| FPTAN (k) = Partial tangent of ST(0) | ESC 001 | 1111 0010 | | 162-430 (j) |
| FPATAN = Partial arctangent of ST(0) | ESC 001 | 1111 0011 | | 250-420 |
| FSIN (k) = Sine of ST(0) | ESC 001 | 1111 1110 | | 121-680 |
| FSINCOS (k) = Sine and cosine of ST(0) | ESC 001 | 1111 1011 | | 150-650 |
| F2XM1 (l) = 2^ST(0) - 1 | ESC 001 | 1111 0000 | | 167-410 |
| FYL2X (m) = ST(1) * log2(ST(0)) | ESC 001 | 1111 0001 | | 99-436 |
| FYL2XP1 (n) = ST(1) * log2[ST(0) + 1.0] | ESC 001 | 1111 1001 | | 210-447 |

Shaded areas indicate instructions not available in 8087/80287.

Notes:
b. Add 3 clocks to the range when d = 1.
c. Add 1 clock to each range when R = 1.
d. Add 3 clocks to the range when d = 0.
e. Typical = 52 (when d = 0, 46-54, typical = 49).
f. Add 1 clock to the range when R = 1.
g. 135-141 when R = 1.
h. Add 3 clocks to the range when d = 1.
i. -0 ≤ ST(0) ≤ +∞.
j. These timings hold for operands in the range |x| < π/4. For operands not in this range, up to 76 additional clocks may be needed to reduce the operand.
k. 0 ≤ ST(0) < 2^63.
l. -1.0 ≤ ST(0) ≤ 1.0.
m. 0 ≤ ST(0) < ∞, -∞ < ST(1) < +∞.
n. 0 ≤ |ST(0)| < [2 - sqrt(2)]/2, -∞ < ST(1) < +∞.
| Instruction | Byte 0 | Byte 1 | Optional Bytes 2-6 | Clock Count Range |
|-------------|--------|--------|--------------------|-------------------|
| CONSTANTS | | | | |
| FLDZ = Load +0.0 to ST(0) | ESC 001 | 1110 1110 | | 10-17 |
| FLD1 = Load +1.0 to ST(0) | ESC 001 | 1110 1000 | | 15-22 |
| FLDPI = Load π to ST(0) | ESC 001 | 1110 1011 | | 26-36 |
| FLDL2T = Load log2(10) to ST(0) | ESC 001 | 1110 1001 | | 26-36 |
| FLDL2E = Load log2(e) to ST(0) | ESC 001 | 1110 1010 | | 26-36 |
| FLDLG2 = Load log10(2) to ST(0) | ESC 001 | 1110 1100 | | 25-35 |
| FLDLN2 = Load loge(2) to ST(0) | ESC 001 | 1110 1101 | | 26-38 |
| PROCESSOR CONTROL | | | | |
| FINIT = Initialize math coprocessor | ESC 011 | 1110 0011 | | 33 |
| FLDCW = Load control word from memory | ESC 001 | MOD 101 R/M | SIB/DISP | 19 |
| FSTCW = Store control word to memory | ESC 001 | MOD 111 R/M | SIB/DISP | 15 |
| FSTSW = Store status word to memory | ESC 101 | MOD 111 R/M | SIB/DISP | 15 |
| FSTSW AX = Store status word to AX | ESC 111 | 1110 0000 | | 13 |
| FCLEX = Clear exceptions | ESC 011 | 1110 0010 | | 11 |
| FSTENV = Store environment | ESC 001 | MOD 110 R/M | SIB/DISP | 117-118 |
| FLDENV = Load environment | ESC 001 | MOD 100 R/M | SIB/DISP | 85 |
| FSAVE = Save state | ESC 101 | MOD 110 R/M | SIB/DISP | 402-403 |
| FRSTOR = Restore state | ESC 101 | MOD 100 R/M | SIB/DISP | 415 |
| FINCSTP = Increment stack pointer | ESC 001 | 1111 0111 | | 21 |
| FDECSTP = Decrement stack pointer | ESC 001 | 1111 0110 | | 22 |
| FFREE = Free ST(i) | ESC 101 | 1100 0 ST(i) | | 18 |
| FNOP = No operations | ESC 001 | 1101 0000 | | 12 |
APPENDIX A
Intel387 SX Math Coprocessor Compatibility

A.1 8087/80287 Compatibility

This section summarizes the differences between the Intel387 SX math coprocessor and the 80287 math coprocessor. Any migration from the 8087 directly to the Intel387 SX math coprocessor must also take into account the differences between the 8087 and the 80287 math coprocessor as listed in Appendix B.

Many changes have been designed into the Intel387 SX math coprocessor to directly support the IEEE standard in hardware. These changes result in increased performance by eliminating the need for software that supports the standard.

A.1.1 General Differences

The Intel387 SX math coprocessor supports only affine closure for infinity arithmetic, not projective closure.

Operands for FSCALE and FPATAN are no longer restricted in range (except for ±∞); F2XM1 and FPTAN accept a wider range of operands.

Rounding control is in effect for FLD constant.

Software cannot change entries of the tag word to values (other than empty) that differ from actual register contents.

After reset, FINIT, and incomplete FPREM, the Intel387 SX math coprocessor resets to zero the condition code bits C3-C0 of the status word.

In conformance with the IEEE standard, the Intel387 SX math coprocessor does not support the special data formats pseudo-zero, pseudo-NaN, pseudo-infinity, and unnormal.

The denormal exception has a different purpose on the Intel387 SX math coprocessor. A system that uses the denormal exception handler solely to normalize the denormal operands would better mask the denormal exception on the Intel387 SX math coprocessor. The Intel387 SX math coprocessor automatically normalizes denormal operands when the denormal exception is masked.
A.1.2 Exceptions

A number of differences exist due to changes in the IEEE standard and to functional improvements to the architecture of the Intel387 SX math coprocessor:

1. When the overflow or underflow exception is masked, the Intel387 SX math coprocessor differs from the 80287 in rounding when overflow or underflow occurs. The Intel387 SX math coprocessor produces results that are consistent with the rounding mode.
2. When the underflow exception is masked, the Intel387 SX math coprocessor sets its underflow flag only if there is also a loss of accuracy during denormalization.
3. Fewer invalid-operation exceptions occur due to denormal operands, because the instructions FSQRT, FDIV, FPREM, and conversions to BCD or to integer normalize denormal operands before proceeding.
4. The FSQRT, FBSTP, and FPREM instructions may cause underflow, because they support denormal operands.
5. The denormal exception can occur during the transcendental instructions and the FXTRACT instruction.
6. The denormal exception no longer takes precedence over all other exceptions.
7. When the denormal exception is masked, the Intel387 SX math coprocessor automatically normalizes denormal operands. The 8087/80287 performs unnormal arithmetic, which might produce an unnormal result.
8. When the operand is zero, the FXTRACT instruction reports a zero-divide exception and leaves -∞ in ST(1).
9. The status word has a new bit (SF) that signals when invalid-operation exceptions are due to stack underflow or overflow.
10. FLD extended precision no longer reports denormal exceptions, because the instruction is not numeric.
11. FLD single/double precision, when the operand is denormal, converts the number to extended precision and signals the denormal operand exception. When loading a signaling NaN, FLD single/double precision signals an invalid-operation exception.
12. The Intel387 SX math coprocessor only generates quiet NaNs (as on the 80287); however, the Intel387 SX math coprocessor distinguishes between quiet NaNs and signaling NaNs. Signaling NaNs trigger exceptions when they are used as operands; quiet NaNs do not (except for FCOM, FIST, and FBSTP, which also raise IE for quiet NaNs).
13. When stack overflow occurs during FPTAN and overflow is masked, both ST(0) and ST(1) contain quiet NaNs. The 80287/8087 leaves the original operand in ST(1) intact.
14. When the scaling factor is ±∞, the FSCALE instruction behaves as follows:
    - FSCALE (0, ∞) generates the invalid-operation exception.
    - FSCALE (finite, -∞) generates zero with the same sign as the scaled operand.
    - FSCALE (finite, +∞) generates ∞ with the same sign as the scaled operand.
    The 8087/80287 returns zero in the first case and raises the invalid-operation exception in the other cases.
15. The Intel387 SX math coprocessor returns signed infinity/zero as the unmasked response to massive overflow/underflow. The 8087 and 80287 support a limited range for the scaling factor; within this range either massive overflow/underflow do not occur or undefined results are produced.
APPENDIX B
COMPATIBILITY BETWEEN THE 80287 AND 8087 MATH COPROCESSOR

The 80286/80287 operating in real address mode will execute 8086/8087 programs without major modification. However, because of differences in the handling of numeric exceptions by the 80287 math coprocessor and the 8087 math coprocessor, exception handling routines may need to be changed. This appendix summarizes the differences between the 80287 math coprocessor and the 8087 math coprocessor, and provides details showing how 8086/8087 programs can be ported to the 80286/80287.

1. The math coprocessor signals exceptions through a dedicated ERROR# line to the 80286. The math coprocessor error signal does not pass through an interrupt controller (the 8087 INT signal does). Therefore, any interrupt-controller-oriented instructions in numeric exception handlers for the 8086/8087 should be deleted.
2. The 8087 instructions FENI and FDISI perform no useful function in the 80287. If the 80287 encounters one of these opcodes in its instruction stream, the instruction will effectively be ignored; none of the 80287 internal states will be updated. While 8086/8087 programs containing these instructions may be executed on the 80286/80287, it is unlikely that the exception handling routines containing these instructions will be completely portable to the 80287.
3. Interrupt vector 16 must point to the numeric exception handling routine.
4. The ESC instruction address saved in the 80287 includes any leading prefixes before the ESC opcode. The corresponding address saved in the 8087 does not include leading prefixes.
5. In protected address mode, the format of the 80287's saved instruction and address pointers is different from that of the 8087. The instruction opcode is not saved in protected mode; exception handlers will have to retrieve the opcode from memory if needed.
6. Interrupt 7 will occur in the 80286 when executing ESC instructions with either TS (task switched) or EM (emulation) of the 80286 MSW set (TS = 1 or EM = 1). If TS is set, then a WAIT instruction will also cause interrupt 7. An exception handler should be included in 80286/80287 code to handle these situations.
7. Interrupt 9 will occur if the second or subsequent words of a floating-point operand fall outside a segment's size. Interrupt 13 will occur if the starting address of a numeric operand falls outside a segment's size. An exception handler should be included in 80286/80287 code to report these programming errors.
8. Except for the processor control instructions, all of the 80287 numeric instructions are automatically synchronized by the 80286 CPU; the 80286 CPU automatically tests the BUSY# line from the 80287 to ensure that the 80287 has completed its previous instruction before executing the next ESC instruction. No explicit WAIT instructions are required to assure this synchronization. For the 8087 used with 8086 and 8088 processors, explicit WAITs are required before each numeric instruction to ensure synchronization. Although 8086/8087 programs having explicit WAIT instructions will execute perfectly on the 80286/80287 without reassembly, these WAIT instructions are unnecessary.
9. Since the 80287 does not require WAIT instructions before each numeric instruction, the ASM286 assembler does not automatically generate these WAIT instructions. The ASM86 assembler, however, automatically precedes every ESC instruction with a WAIT instruction. Although numeric routines generated using the ASM86 assembler will generally execute correctly on the 80286/80287, reassembly using ASM286 may result in a more compact code image. The processor control instructions for the 80287 may be coded using either a WAIT or no-WAIT form of mnemonic. The WAIT forms of these instructions cause ASM286 to precede the ESC instructions with a CPU WAIT instruction, in the same manner as ASM86 does.
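The interrupt 7 rule in item 6 can be expressed as a short Python sketch, which may help when deciding what a ported exception handler has to cover. The function name raises_interrupt_7 and the string labels "ESC" and "WAIT" are hypothetical. The EM and TS bit positions (bits 2 and 3 of the machine status word) are not given in this appendix and are supplied here as an assumption. The sketch follows only the rule as worded in item 6 and ignores any other MSW bits.

```python
# Assumed bit positions of the two 80286 MSW flags named in item 6.
MSW_EM = 1 << 2  # EM: emulation
MSW_TS = 1 << 3  # TS: task switched

def raises_interrupt_7(msw, instruction):
    """Return True if item 6 says this instruction causes interrupt 7.

    msw is the machine status word as an integer; instruction is the
    hypothetical label "ESC" (any numeric instruction) or "WAIT".
    """
    ts = bool(msw & MSW_TS)
    em = bool(msw & MSW_EM)
    if instruction == "ESC":
        return ts or em      # TS = 1 or EM = 1
    if instruction == "WAIT":
        return ts            # WAIT is affected by TS only, per item 6
    raise ValueError("instruction must be 'ESC' or 'WAIT'")
```

With only EM set, raises_interrupt_7(MSW_EM, "ESC") is True while raises_interrupt_7(MSW_EM, "WAIT") is False, which matches the distinction item 6 draws between the two flags.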
